[
  {
    "path": ".gitignore",
    "content": "jarvis/\nchatgptwrapper/\noutput.wav\n.env\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\nshare/python-wheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.nox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n*.py,cover\n.hypothesis/\n.pytest_cache/\ncover/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\ndb.sqlite3\ndb.sqlite3-journal\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\n.pybuilder/\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# IPython\nprofile_default/\nipython_config.py\n\n# pyenv\n#   For a library or package, you might want to ignore these files since the code is\n#   intended to run in multiple environments; otherwise, check them in:\n# .python-version\n\n# pipenv\n#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.\n#   However, in case of collaboration, if having platform-specific dependencies or dependencies\n#   having no cross-platform support, pipenv may install dependencies that don't work, or not\n#   install all needed dependencies.\n#Pipfile.lock\n\n# poetry\n#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.\n#   This is especially recommended for binary packages to ensure reproducibility, and is more\n#   commonly ignored for libraries.\n#   
https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control\n#poetry.lock\n\n# pdm\n#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.\n#pdm.lock\n#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it\n#   in version control.\n#   https://pdm.fming.dev/#use-with-ide\n.pdm.toml\n\n# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm\n__pypackages__/\n\n# Celery stuff\ncelerybeat-schedule\ncelerybeat.pid\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\nvicuna/oobabooga-windows/\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n.dmypy.json\ndmypy.json\n\n# Pyre type checker\n.pyre/\n\n# pytype static type analyzer\n.pytype/\n\n# Cython debug symbols\ncython_debug/\n\n# PyCharm\n#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can\n#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore\n#  and can be added to the global gitignore or merged into this file.  For a more nuclear\n#  option (not recommended) you can uncomment the following to ignore the entire idea folder.\n#.idea/\n"
  },
  {
    "path": "Assistant/Agents.py",
    "content": "from langchain import OpenAI, LLMChain\nfrom langchain.llms import OpenAI\nfrom langchain.agents import Tool, AgentExecutor, ZeroShotAgent\nfrom langchain.memory import ConversationBufferMemory \nfrom langchain.agents import initialize_agent, load_tools\nimport datetime\n\nfrom Assistant import VirtualAssistant\nimport os\n\n# generate a Zero Shot React Agent with memory that looks K interactions behind\ndef generateReactAgent(VA:VirtualAssistant, k:int):\n    # Local Search Engine ($)\n    LocalSearchEngine = Tool(\n        name= 'Key Search',\n        func=VA.find_file,\n        description=\"Useful when you don't know the name of a resource. Inputs should be keywords. Keywords are used to find resources. You ddon't know the name of the resources\"\n    )\n\n    Save = Tool(name='Memorize', \n             func=VA.save_chat,\n             description='save the current conversation. Useful for when the conversation will be needed in future')\n\n    FileReader = Tool(\n        name='Load File',\n        func=VA.open_file,\n        description='Useful when you have file names. Loads the content of a file given its filename'\n    )\n    \n    Summarize = Tool(\n        name='TLDR',\n        func=VA.search_engine.tldr,\n        description='Summarize large amounts of text'\n    )\n    tools = [LocalSearchEngine, Save, FileReader, Summarize]\n    \n    memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, human_prefix='user', ai_prefix='assistant')\n\n    # need to work on a custom LangChain llm model \n    prefix = \"\"\"You are an AI research assistant designed to assist users with their academic research. 
You are equipped with these tools:\"\"\"\n    suffix = \"\"\"Begin!\"\n\n    {chat_history}\n    Question: {input}\n    {agent_scratchpad}\"\"\"\n\n    prompt = ZeroShotAgent.create_prompt(\n        tools, \n        prefix=prefix, \n        suffix=suffix, \n        input_variables=[\"input\", \"chat_history\", \"agent_scratchpad\"]\n    )\n    llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)\n    agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)\n\n    # adding a window of memory:\n    memory = build_memory(chat_history = VA.current_conversation(), k=k)\n    \n    return AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True, memory=memory, early_stopping_method ='generate', max_iterations=2)\n\n\n\ndef build_memory(chat_history, k):\n    memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, human_prefix='user', ai_prefix='assistant')\n    k = min(k, len(chat_history)//2)\n\n    if k==0 :return memory\n\n    if chat_history[-k][\"role\"] != 'user': \n        print('refreshing memory warning - considering last interaction only')\n        k=1\n    try:\n        for i in range(-k*2-1, -1, 2):\n            input  = chat_history[i]['content']\n            output = chat_history[i+1]['content']\n            memory.save_context({\"input\":input}, {\"output\":output})\n    except Exception:\n        # fall back to an empty memory if the history cannot be replayed\n        memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, human_prefix='user', ai_prefix='assistant')\n    return memory\n\n\n\ndef generateGoogleAgent(VA:VirtualAssistant, k:int):\n    names = ['wikipedia', 'requests', 'open-meteo-api']\n    llm = OpenAI(temperature=0)\n    # os.getenv returns None for unset variables, so default to '' before calling len()\n    if len(os.getenv('SERPER_API_KEY', ''))>1:\n        print('(using google-serper)')\n        names.append('google-serper')\n    elif len(os.getenv('GOOGLE_API_KEY', ''))>1:\n        print('(using google-search)')\n        names.append('google-search')\n    \n    tools = load_tools(names, llm=llm)\n    custom_tools = [\n 
       Tool(\n            name ='Locate me',\n            func = locate_me, \n            description='useful to know the current geographical location'),\n        Tool(\n            name='News',\n            func=news,\n            description='Use this when you want to get information about the top headlines of current news stories. The input should be a keyword describing the topic'),\n        Tool(\n            name='Today',\n            func=today,\n            description='Useful to know the current day'),\n        Tool(\n            name='Delta days',\n            func= time_between_dates,\n            description='Use this you need to compute the time between two Dates. Input should be two dates in the ISO 8601 format: Year-Month-Day'\n        )\n        ]\n    \n    for item in custom_tools: tools.append(item)\n        \n    prefix = \"\"\"Answer the question. You have also access to the following tools:\"\"\"\n    suffix = \"\"\"Begin!\"\n\n    {chat_history}\n    Question: {input}\n    {agent_scratchpad}\"\"\"\n\n    prompt = ZeroShotAgent.create_prompt(\n        tools, \n        prefix=prefix, \n        suffix=suffix, \n        input_variables=[\"input\", \"chat_history\", \"agent_scratchpad\"]\n    )\n    llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)\n    agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)\n\n    # adding a window of memory:\n    memory = build_memory(VA.current_conversation(), k)\n    \n    return AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True, memory=memory, early_stopping_method = 'generate', max_iterations=4)\n\n\n\n## FUNCTIONS FOR TOOLS\nimport geocoder\nfrom newsapi import NewsApiClient\n\ndef locate_me(p):\n    g = geocoder.ip('me')\n    return [g.city, g.state, g.country]\n\ndef today(p):\n    return str(datetime.date.today())\n\ndef time_between_dates(date1, date2):\n    try:\n        date1 = datetime.date.fromisoformat(date1)\n        date2 = 
datetime.date.fromisoformat(date2)\n    except (TypeError, ValueError):\n        return 'date format incorrect'\n    # report the absolute difference regardless of argument order\n    return str(abs(date2-date1))\n\ndef news(keyword):\n    newsapi = NewsApiClient(api_key=os.getenv('NEWS_API_KEY'))\n    top_headlines = newsapi.get_top_headlines(q=keyword)\n    if len(top_headlines['articles'])==0:\n        # no headlines for this keyword: fall back to a general search, capped at 10 articles\n        top_headlines = newsapi.get_everything(q=keyword)\n        top_headlines['articles'] = top_headlines['articles'][:10]\n\n    res = ''\n\n    for article in top_headlines['articles']:\n        res += '\\nsource: '+article['source']['name']+'\\n'\n        res += '\\n'+article['title']+'\\nurl: '+article['url']+'\\n'\n        # the description may be missing (None) for some articles\n        res += article['description'] or ''\n\n    print(res)\n    return res"
  },
  {
    "path": "Assistant/Chat.py",
    "content": "\n"
  },
  {
    "path": "Assistant/VirtualAssistant.py",
    "content": "# import for prompt routing\r\nfrom langchain import OpenAI\r\nfrom langchain.agents import Tool\r\nfrom langchain.agents import initialize_agent\r\n\r\nfrom Assistant.research_mode import ResearchAssistant\r\nfrom .Agents import generateReactAgent, generateGoogleAgent\r\nimport tiktoken\r\n\r\n# imports for chats\r\nimport  pygame\r\nimport os\r\nimport re\r\n\r\nimport pandas as pd\r\nfrom datetime import datetime\r\nimport copy\r\nimport openai\r\nimport time\r\nimport langid\r\nimport torch\r\n\r\nimport Assistant.get_audio as myaudio\r\nfrom .voice import *\r\nfrom .tools import Translator, LocalSearchEngine, AssistantChat\r\nfrom .tools import parse_conversation, count_tokens, take_last_k_interactions\r\nfrom .webui import oobabooga_textgen\r\n\r\n# imports for audio\r\nimport whisper\r\nimport wave\r\nimport pyaudio\r\nimport speech_recognition as sr\r\nimport time\r\nimport sys\r\nfrom contextlib import contextmanager\r\n\r\n#module used for speaking during recording\r\nimport webrtcvad\r\n      \r\nclass VirtualAssistant:\r\n    DEBUG = True\r\n    DEFAULT_CHAT =  AssistantChat([{\"role\": \"system\", \"content\": \"You are a helpful assistant. 
You can ask questions to keep the conversation entertaining.\"}])\r\n    RESPONSE_TIME = 1.5 #values that work well in my environment (ticks, not seconds)\r\n    SLEEP_DELAY = 3 #seconds \r\n    MIN_RECORDING_TIME = .5 #seconds\r\n    MAX_RECORDING_TIME = 60 #seconds\r\n    VAD_AGGRESSIVENESS = 2  #0-3\r\n\r\n    MAX_TOKENS = 4096\r\n\r\n    DEVICE_INDEX = myaudio.detect_microphones()[0]\r\n    CHUNK = 1024\r\n    FORMAT = pyaudio.paInt16\r\n    CHANNELS = myaudio.get_device_channels()[DEVICE_INDEX]\r\n    RATE = int(myaudio.get_devices()[DEVICE_INDEX]['defaultSampleRate'])\r\n\r\n    print('using input device: ', myaudio.get_devices()[DEVICE_INDEX]['name'])\r\n\r\n    CONVERSATION_LONG_ENOUGH = 4 #interactions (2 questions)\r\n\r\n    def __init__(self, \r\n                 whisper_model=None, \r\n                 awake_with_keywords = ['elephant'],\r\n                 model = \"gpt-3.5-turbo\",\r\n                 embed_model = \"text-embedding-ada-002\",\r\n                 translator_model = 'argostranslator',\r\n                 \r\n                 **kwargs):\r\n        try:\r\n            openai.api_key = kwargs['openai_api']\r\n        except KeyError:\r\n            print('OpenAI API key not found')\r\n        \r\n        # HEAVY STUFF FIRST\r\n        # Filling the GPU with the model\r\n        if whisper_model is None:\r\n            if 'whisper_size' not in kwargs: raise Exception('whisper model needs to be specified')\r\n            self.interpreter = whisper.load_model(kwargs['whisper_size'])\r\n        else:\r\n            self.interpreter = whisper_model\r\n\r\n        # STATUS and PROMPT ANALYZER\r\n        # MODE is stored upper-cased because every later check compares against 'CHAT'/'RESEARCH';\r\n        # defaulting to 'CHAT' when no mode is passed is an assumption\r\n        self.MODE = kwargs.get('mode', 'CHAT').upper()\r\n        if self.MODE != 'CHAT' and self.MODE != 'RESEARCH': raise KeyError()\r\n        self.DIRECTORIES={\r\n            'CHAT_DIR': os.path.realpath(os.path.join(os.getcwd(), 'saved_chats')),\r\n            
'SOUND_DIR':os.path.realpath(os.path.join(os.getcwd(), 'Assistant', 'sounds')),\r\n            'VOICE_DIR':os.path.realpath(os.path.join(os.getcwd(), 'Assistant', 'voices'))\r\n        }\r\n\r\n        self.func_descript={\r\n            \"CHAT\":[\r\n                \"tools: the prompt requires an action like handling a file, saving a conversation, changing some specified parameters...\",\r\n                \"respond: provide an answer to a question\",\r\n                \"you don't know the answer or you can't satisfy the request.\"],\r\n            \"RESEARCH\":[\r\n                \"tools: the prompt requires one or multiple actions like reading a file, downloading a known resource\",\r\n                \"respond: provide an answer based on scientific information\",\r\n            ]\r\n        }\r\n\r\n\r\n        # TEXT and VOICE\r\n        if 'voice_id' in kwargs.keys():\r\n            for item in kwargs['voice_id']:\r\n                print(kwargs['voice_id'][item])\r\n                kwargs['voice_id'][item] = (os.path.join(self.DIRECTORIES[\"VOICE_DIR\"], kwargs['voice_id'][item])) + '.wav'\r\n        else:\r\n            kwargs['voice_id'] = os.path.join(self.DIRECTORIES[\"VOICE_DIR\"], 'default.wav')\r\n\r\n        self.languages = {\r\n            'en': \"English\",\r\n            'it': \"Italian\",\r\n            # add yours\r\n        }\r\n\r\n\r\n        self.voice = Voice(write_dir = self.DIRECTORIES['SOUND_DIR'], languages = self.languages, **kwargs)\r\n        self.translator = Translator(model=translator_model, translator_languages = list(self.languages.keys()))\r\n        self.answer_engine = model\r\n\r\n        self.search_engine = LocalSearchEngine(\r\n            embed_model = embed_model, \r\n            tldr_model = kwargs['search_engine_llm'] if 'search_engine_llm' in list(kwargs.keys()) else model,\r\n            translator_model=translator_model,\r\n            translator_languages = list(self.languages.keys()))\r\n        \r\n        
self.is_awake = False\r\n        # work on a copy so that appended messages never leak into the class-level DEFAULT_CHAT\r\n        self.current_conversation = copy.deepcopy(self.DEFAULT_CHAT)\r\n\r\n        # AUDIO \r\n        # initialize the VAD module\r\n        self.vad = webrtcvad.Vad()\r\n        self.vad.set_mode(self.VAD_AGGRESSIVENESS) \r\n        \r\n        # awake_with_keywords is a named parameter, so it can never appear in kwargs\r\n        self.Keywords = awake_with_keywords\r\n        self.ears = sr.Recognizer()\r\n\r\n\r\n        # init finished\r\n        self.play('system_online_bleep.mp3')\r\n    \r\n    # STATUS ###############################################################################################\r\n    def switch_mode(self):\r\n        if self.MODE == 'CHAT': \r\n            self.say('Moving into research mode', VoiceIdx='en', elevenlabs=True)\r\n            self.play('Sci-Fi-UI.mp3',loop=True)\r\n            self.init_research_mode()\r\n            pygame.mixer.stop()\r\n            self.play('system_online_bleep.mp3', PlayAndWait=True)\r\n            # langid.classify returns a (language, score) tuple: keep only the language code\r\n            response =  self.translator.translate('research mode is ready', to_language=langid.classify(self.current_conversation[-1]['content'])[0], from_language='en').lower()\r\n            return response\r\n        else:\r\n            self.MODE = 'CHAT'\r\n            response =  self.translator.translate('chat mode enabled', to_language=langid.classify(self.current_conversation[-1]['content'])[0],from_language='en').lower()\r\n            return response\r\n\r\n    def identify_explicit_command(self, prompt):\r\n        prompt = self.translator.translate(prompt, to_language='en').lower()\r\n        \r\n        # if the prompt is long it's unlikely to be an explicit command \r\n        # (this condition prevents false positives)\r\n        if len(prompt.split())>15: return\r\n        \r\n        INTERNET_COMMANDS = [\r\n            \"do an internet search\",\r\n            \"look on the web\",\r\n            \"do a web 
search\",\r\n            \"control on the internet\",\r\n            \"do a search\",\r\n            \"make a search\",\r\n            \"perform a search\",\r\n            \"perform a web search\"]\r\n        \r\n        if (\"research mode\" in prompt and self.MODE=='CHAT') or (\"chat mode\" in prompt and self.MODE=='RESEARCH'):\r\n            print('found explicit command')\r\n            return '-1'\r\n        \r\n        if self.MODE == 'CHAT':\r\n            if any(command in prompt for command in INTERNET_COMMANDS):\r\n                print('found explicit command')\r\n                return '3'\r\n        \r\n        if self.MODE == 'RESEARCH':\r\n            if \"new workspace\" in prompt or \"new environment\" in prompt:\r\n                print('found explicit command')\r\n                return '3'\r\n        \r\n    \r\n    def use_tools(self, prompt, debug = DEBUG):\r\n        if debug: print(' -use tools ')\r\n        # tools for chat mode\r\n        if self.MODE == \"CHAT\":\r\n            ActionManager = generateReactAgent(self, k=1)\r\n            return ActionManager.run(input = prompt)\r\n        \r\n        # research mode\r\n        else:\r\n            return self.ResearchAssistant.agent.run(input = prompt)\r\n            \r\n    \r\n    def secondary_agent(self, prompt, debug = DEBUG): \r\n        if self.MODE == 'CHAT':\r\n            if debug: print(' - web surfing ')   \r\n            WebSurfingAgent = generateGoogleAgent(self, k=1)\r\n            return WebSurfingAgent.run(prompt)\r\n        if self.MODE == 'RESEARCH':\r\n            if debug: print(' - assessing new workspace ') \r\n            return self.ResearchAssistant.PROTOCOL_begin_new_workspace(prompt)\r\n        \r\n\r\n    def set_directories(self, **kwargs):\r\n        for item in kwargs:\r\n            try:\r\n                if not(os.path.isdir(kwargs[item])): raise Exception\r\n\r\n                print(f'updating {item} from {self.DIRECTORIES[item.upper()]} == to => 
{kwargs[item]}')\r\n                self.DIRECTORIES[item.upper()] = kwargs[item]\r\n                \r\n            except:\r\n                self.play('error.mp3', PlayAndWait=True)\r\n                print(f\"{kwargs[item]}: not found\")\r\n\r\n    def go_to_sleep(self):\r\n        print('[Assistant going to sleep]')\r\n        self.is_awake = False\r\n        if len(self.current_conversation()) > self.CONVERSATION_LONG_ENOUGH:\r\n            self.save_chat()\r\n\r\n        self.play('sleep.mp3', PlayAndWait=True)\r\n\r\n    # [stable]\r\n    def analyze_prompt(self, prompt, debug = DEBUG):\r\n        if debug: print(f' - analyzing prompt in {self.MODE} mode')\r\n        # Hard coded options: DO this, Look on INTERNET...\r\n        flag = self.identify_explicit_command(prompt)\r\n        if flag is not None: return flag\r\n\r\n        # CHAT MODE \r\n        if self.MODE == 'CHAT':\r\n            context =\"\"\"You are a prompt manager. A number must always be present in your answer. You can perform some actions and decide which associated number is required. Your actions:\"\"\"\r\n            for i, function in enumerate(self.func_descript['CHAT']):\r\n                context += f\"\\n{i+1}) {function};\"\r\n            context += \"\\nYou can answer only with numbers. 
A number must always be present in your answer.\"\r\n            context += \"\"\"\\nHere are some example:\r\n            \\nPROMPT: 'find and summarize all the files about history'\\n1\r\n            \\nPROMPT: 'find a past conversation about planes'\\n1\r\n            \\nPROMPT: 'do you agree?'\\n2\r\n            \\nPROMPT: 'Salva questa conversazione'\\n1\r\n            \\nPROMPT: 'How is the weather?'\\n3\r\n            \\nPROMPT: 'credo sia giusto.'\\n2\r\n            \\nPROMPT: '¿Cuál es la noticia de hoy?'\\n3\r\n            \\nPROMPT: 'Thank you'\\n2\"\"\"\r\n\r\n            CHAT = [{\"role\": \"system\", \"content\": context},\r\n                    {\"role\": \"user\", \"content\":f\"PROMPT: '{prompt}'\"}]\r\n\r\n            flag = self.identify_explicit_command(prompt)\r\n\r\n        # RESEARCH MODE    \r\n        else:\r\n            context =\"\"\"You are a prompt manager. A number must always be present in your answer. You can perform some actions and decide which associated number is required. You are designed to assist users with their academic research. You are equipped with a range of tools. Your tools:\"\"\"\r\n            for i, function in enumerate(self.func_descript['RESEARCH']):\r\n                context += f\"\\n{i+1}) {function};\"\r\n            context += \"\\nYou can answer only with numbers. 
A number must always be present in your answer.\"\r\n            context += \"\"\"\\nHere are some examples:\r\n            \\nPROMPT: 'begin a new project'\\n1\r\n            \\nPROMPT: 'download papers about ...'\\n1\r\n            \\nPROMPT: 'what are the mechanical properties of carbon fiber?'\\n2\r\n            \\nPROMPT: 'Salva questa conversazione'\\n1\r\n            \\nPROMPT: 'Who are the authors of the paper XYZ?'\\n2\r\n            \\nPROMPT: 'What studies mention Transformers architectures?'\\n2\r\n            \\nPROMPT: 'Find new papers that are similar to paper XYZ'\\n1\r\n            \\nPROMPT: 'Tell me more'\\n2\r\n            \\nPROMPT: 'what is up?'\\n2\"\"\"\r\n\r\n            CHAT = [{\"role\": \"system\", \"content\": context},\r\n                    {\"role\": \"user\", \"content\":f\"PROMPT: '{prompt}'\"}]\r\n\r\n\r\n        if debug: print(' - - submitting request')\r\n        response = openai.ChatCompletion.create(\r\n                    model=\"gpt-3.5-turbo\",\r\n                    temperature=0,\r\n                    max_tokens=10,\r\n                    messages=CHAT)\r\n        flag = response['choices'][0]['message']['content']\r\n        if debug: print(' - - got answer')\r\n        \r\n        return flag\r\n\r\n    # CONVERSATION ################################################################################\r\n\r\n    def start_new_conversation(self):\r\n        if len(self.current_conversation)>2: \r\n            print('forgetting the last conversation')\r\n        # start from a fresh copy so that appended messages never leak into DEFAULT_CHAT\r\n        self.current_conversation = copy.deepcopy(self.DEFAULT_CHAT)\r\n\r\n    def expand_conversation(self, role, content): self.current_conversation.append({\"role\":role, \"content\":content})\r\n\r\n    def get_answer(self, question, optimize_cuda = False, debug=DEBUG):\r\n        if debug: print(' - thinking')\r\n        if self.MODE == \"CHAT\":\r\n            temp = copy.deepcopy(self.current_conversation())\r\n            temp.append({\"role\":\"user\", 
\"content\":question})\r\n\r\n            self.play('thinking.mp3', loop=True)\r\n\r\n            if self.answer_engine == 'gpt-3.5-turbo':\r\n                if debug: print(' - - submitting request')\r\n                API_response = openai.ChatCompletion.create(\r\n                    model=self.answer_engine,\r\n                    messages=temp)\r\n                answer = API_response['choices'][0]['message']['content']\r\n                if debug: print(' - - got answer')\r\n                \r\n            \r\n            elif self.answer_engine == 'anon8231489123_vicuna-13b-GPTQ-4bit-128g':\r\n                lang_id = langid.classify(question)[0]\r\n                if optimize_cuda:\r\n                    # free space on the GPU\r\n                    self.deallocate_whisper()\r\n                # use GPU to process the answer\r\n                answer = oobabooga_textgen(prompt = temp)\r\n                answer = self.translator.translate(answer, from_language=langid.classify(answer)[0], to_language=lang_id)\r\n                if optimize_cuda:\r\n                    # try to get the model back to GPU\r\n                    self.allocate_whisper()\r\n\r\n            elif self.answer_engine == 'eachadea_ggml-vicuna-13b-4bit':\r\n                answer = oobabooga_textgen(prompt = question)\r\n        \r\n        # RESEARCH MODE\r\n        else:\r\n            if self.ResearchAssistant.query_engine == None:\r\n                return 'error: no workspace loaded. 
I cannot provide precise information without a workspace loaded on research mode'\r\n            res = self.ResearchAssistant.query_engine.query(question)\r\n            answer = res.response\r\n        pygame.mixer.stop()\r\n\r\n        self.expand_conversation(role=\"assistant\", content=answer)\r\n\r\n        self.last_interaction = time.perf_counter()\r\n        if debug: print(' - - finished')\r\n        return answer\r\n\r\n    def save_chat(self, debug = DEBUG):\r\n        if debug: print(' - saving chat')\r\n        if not os.path.isdir(self.DIRECTORIES['CHAT_DIR']): os.mkdir(self.DIRECTORIES['CHAT_DIR'])\r\n\r\n        if not self.current_conversation.is_saved(): \r\n            if debug: print(' - - generating title')\r\n            title = self.get_answer(question=\"generate a very short title for this conversation\")\r\n            self.say(f'I am saving this conversation with title: {title}', VoiceIdx='en', IBM=False, elevenlabs=True)\r\n\r\n            self.play('data_writing.mp3', PlayAndWait=True)\r\n            \r\n            prompt =  [{\"role\": \"system\", \"content\": \"You don't like redundancy and use as few words as possible\"},\r\n                {\"role\":\"user\", \"content\":f\"Associate a tag to this title: {title} \\nHere is an example: 'Exploring Text to Speech Popular Techniques and Deep Learning Approaches' is associated to 'Deep Learning'\"}]\r\n            \r\n            if debug: print(' - - submitting request')\r\n            API_response = openai.ChatCompletion.create(\r\n                        model='gpt-3.5-turbo',\r\n                        max_tokens=5,\r\n                        temperature=0,\r\n                        messages=prompt)\r\n            if debug: print(' - - got answer')\r\n            if debug: print(' - - processing response')\r\n            answer = API_response['choices'][0]['message']['content']\r\n            answer = re.sub(r'[^\\w\\s]', '',answer)\r\n            answer = re.sub(' ', '',answer)\r\n 
           fname = datetime.today().strftime('%Y-%m-%d') + '_' + str(answer) + '.txt'\r\n            self.current_conversation.filename = fname\r\n        else:\r\n            # read the stored filename before announcing it\r\n            fname = self.current_conversation.filename\r\n            self.say(f'I am overwriting the conversation {fname}', VoiceIdx='en', IBM=False, elevenlabs=True)\r\n\r\n        # the with-block closes the file on exit\r\n        with open(os.path.join(self.DIRECTORIES['CHAT_DIR'], fname), 'w') as f:\r\n            for message in self.current_conversation():\r\n                f.write(message[\"role\"]+ ': ' + message[\"content\"]+'\\n')\r\n        \r\n        self.is_awake = False\r\n        return f\"file: {os.path.join(self.DIRECTORIES['CHAT_DIR'], fname)} saved successfully\"\r\n\r\n\r\n    # ACTIONS ##################################################################################\r\n    def init_research_mode(self, workspace=None):\r\n        \r\n        if workspace is None:\r\n            # get the most recently modified workspace\r\n            if 'workspaces' in os.listdir(os.getcwd()):\r\n                search_dir = os.path.join('workspaces')\r\n                subdirs = os.listdir(search_dir)\r\n                subdirs.sort(key=lambda fn: os.path.getmtime(os.path.join(search_dir, fn)))\r\n                subdirs.reverse()\r\n                for subd in subdirs:\r\n                    folder_path = os.path.join('workspaces',subd)\r\n                    if os.path.isdir(folder_path):\r\n                        self.say('loading the last created workspace', VoiceIdx='en', elevenlabs=True)\r\n                        workspace = os.path.abspath( folder_path )\r\n                        break\r\n\r\n        self.play('Sci-Fi-UI.mp3',loop=True)\r\n        self.MODE = 'RESEARCH'\r\n        self.ResearchAssistant = ResearchAssistant(\r\n            current_conversation=self.current_conversation,\r\n            index_name='paperquestioning',\r\n            workspace=workspace)\r\n        pygame.mixer.stop()\r\n        \r\n    
def deallocate_whisper(self):\r\n        # nothing to move when the model is not on the GPU\r\n        if self.interpreter.device.type != 'cuda': return\r\n\r\n        model_name = self.interpreter.name\r\n        self.interpreter = None\r\n        torch.cuda.empty_cache()\r\n\r\n        print('loading Whisper model to cpu')\r\n        self.interpreter = whisper.load_model(model_name, device='cpu')\r\n        torch.cuda.empty_cache()\r\n\r\n    def allocate_whisper(self):\r\n        # nothing to move when the model is not on the cpu\r\n        if self.interpreter.device.type != 'cpu': return\r\n\r\n        model_name = self.interpreter.name\r\n        self.interpreter = None\r\n        torch.cuda.empty_cache()\r\n        try:\r\n            print('loading Whisper model to CUDA')\r\n            self.interpreter = whisper.load_model(model_name, device='cuda')\r\n        except Exception:\r\n            self.interpreter = None\r\n            print(f\"insufficient dedicated CUDA memory: {torch.cuda.memory_allocated()/1e6} MB already occupied\")\r\n            print(\"keeping Whisper model on cpu\")\r\n            self.interpreter = whisper.load_model(model_name, device='cpu')\r\n        torch.cuda.empty_cache()\r\n                \r\n    def switch_whisper_device(self):\r\n        model_name = self.interpreter.name\r\n        model_current_device = self.interpreter.device\r\n\r\n        self.interpreter = None\r\n        torch.cuda.empty_cache()\r\n\r\n        if model_current_device.type == 'cuda':\r\n            print('loading Whisper model to cpu')\r\n            self.interpreter = whisper.load_model(model_name, device='cpu')\r\n            torch.cuda.empty_cache()\r\n        else:\r\n            try:\r\n                torch.cuda.empty_cache()\r\n                print('loading Whisper model to CUDA')\r\n                self.interpreter = whisper.load_model(model_name, device='cuda')\r\n            except:\r\n           
     print(f\"insufficient dedicated CUDA memory: {torch.cuda.memory_allocated()/1e6} MB already occupied\")\r\n                print(\"keeping Whisper model on cpu\")\r\n                self.interpreter = whisper.load_model(model_name, device='cpu')\r\n                torch.cuda.empty_cache()\r\n    \r\n    def open_file(self, filename, debug=DEBUG):\r\n        if debug: print(' - opening file')\r\n        # look for the file\r\n        file = None\r\n        for fname in os.listdir(self.DIRECTORIES['CHAT_DIR']):\r\n            \r\n            # look for sub-strings (in case extension is forgotten)\r\n            if filename in fname:\r\n                # open the matched file (fname), not the possibly partial name that was passed in\r\n                with open(os.path.join(self.DIRECTORIES['CHAT_DIR'], fname), 'r') as f:\r\n                    file = f.read()\r\n                break\r\n\r\n        if file is None: return 'No such file'\r\n\r\n        return file  \r\n   \r\n\r\n    def find_file(self, keywords, n=3, debug=DEBUG):\r\n        if debug: print(' -finding file')\r\n        #self.play('thinking.mp3', loop=True)\r\n        summary = self.search_engine.accurate_search(key=keywords, from_csv=True, n=n)\r\n        # self.play('wake.mp3')\r\n\r\n        response = ''\r\n        for i in range(n):\r\n            response += f\"\\nFilename: {summary.file_names[i]} ; Topics discussed: {summary.tags[i]}\" \r\n        return response\r\n\r\n\r\n\r\n\r\n    # SPEAK ####################################################################################\r\n    def play(self, fname, PlayAndWait=False, loop=False, debug = DEBUG):\r\n        if loop: loop=-1\r\n        else: loop = 0\r\n\r\n        if pygame.mixer.get_init() is None: pygame.mixer.init()\r\n        if debug: print(' - playing')\r\n        try:\r\n            pygame.mixer.music.load(os.path.join(self.DIRECTORIES[\"SOUND_DIR\"], fname))\r\n        except Exception as e:\r\n            print(e)\r\n            return\r\n        pygame.mixer.music.set_volume(0.5)\r\n        pygame.mixer.music.play(loops=loop)\r\n\r\n        if 
PlayAndWait:\r\n            while(pygame.mixer.music.get_busy()):pass\r\n        if debug: print(' - -  finihed playing')\r\n\r\n    def say(self, text, VoiceIdx='jarvis', elevenlabs=False, IBM=False):  \r\n        langIdx = langid.classify(text)[0]\r\n\r\n        print(f\"[Assistant]: {text}\")\r\n        if elevenlabs and IBM: raise(Exception('IBM and ElevenLabs can t be both true'))\r\n        \r\n        if elevenlabs:\r\n            try:\r\n                try: \r\n                    self.voice.speak(text=text, VoiceIdx=langIdx, elevenlabs=True, IBM=False, mode='online')\r\n                    return\r\n                except Exception as e:\r\n                    print(f\"couldn t speak with: {e}\")\r\n                    self.voice.speak(text=text, VoiceIdx=langIdx, elevenlabs=False, IBM=True, mode='online')\r\n                    return\r\n            except:\r\n                self.voice.speak(text=text, VoiceIdx=VoiceIdx, elevenlabs=False, IBM=False, mode='offline')\r\n                return\r\n        \r\n        elif IBM:\r\n            try:\r\n                try: \r\n                    self.voice.speak(text=text, VoiceIdx=langIdx, elevenlabs=False, IBM=True, mode='online')\r\n                    return\r\n                except:\r\n                    self.voice.speak(text=text, VoiceIdx=langIdx, elevenlabs=True, IBM=False, mode='online')\r\n                    return\r\n            except:\r\n                self.voice.speak(text=text, VoiceIdx=VoiceIdx, elevenlabs=False, IBM=False, mode='offline')\r\n                return\r\n        \r\n        try:\r\n            self.voice.speak(text=text, VoiceIdx='jarvis',elevenlabs=False, IBM=False, mode='offline')\r\n        except Exception as e:\r\n            self.voice.speak(text=text, VoiceIdx=langIdx, elevenlabs=False, IBM=False, mode='offline')\r\n            print(VoiceIdx, elevenlabs, IBM)\r\n            print(e)\r\n            raise Exception('No such specifications')\r\n\r\n\r\n    # LISTEN 
#############################################################################################\r\n\r\n    \r\n    #function that blocks the code until the wakeword, or wakewords are encountered\r\n    def block_until_wakeword(self, verbosity=False):        \r\n        if verbosity: print(\"listening passively...\", end=\"\")\r\n\r\n        from struct import unpack_from\r\n        import pvporcupine\r\n\r\n        #initialize values\r\n        porcupine = None\r\n        pa = None\r\n        audio_stream = None\r\n\r\n        try:\r\n            porcupine = pvporcupine.create(access_key=os.environ[\"PORCUPINE_KEY\"], \r\n                                        keywords=self.Keywords)\r\n\r\n            pa = pyaudio.PyAudio()\r\n\r\n            audio_stream = pa.open(\r\n                rate=porcupine.sample_rate,\r\n                channels=1,\r\n                format=pyaudio.paInt16,\r\n                input=True,\r\n                frames_per_buffer=porcupine.frame_length)\r\n\r\n            #not strictly necessary, but helps debug if something overwrote the keywords\r\n            print(f\"Listening for wake word '{self.Keywords[0]}'...\")\r\n\r\n            #loop to preform while waiting(does not noticeably use the CPU)\r\n            while True:\r\n                pcm = audio_stream.read(porcupine.frame_length)\r\n                pcm = unpack_from(\"h\" * porcupine.frame_length, pcm)\r\n\r\n                keyword_index = porcupine.process(pcm)\r\n\r\n                #same actions activated as previous function\r\n                #NOTE: keyword_index is -1 unless wakeword encountered, then it's the index of the wakeword in the list\r\n                #(different wakewords activate different profiles?)\r\n                if keyword_index >= 0:\r\n                    print(\"wakeword encountered\")\r\n                    self.start_new_conversation()\r\n                    self.play('wake.mp3',PlayAndWait=False)\r\n                    self.is_awake = True\r\n    
                return\r\n        finally:\r\n            #clean up\r\n            if audio_stream is not None:\r\n                audio_stream.close()\r\n            if pa is not None:\r\n                pa.terminate()\r\n            if porcupine is not None:\r\n                porcupine.delete()\r\n     \r\n    def listen_passively(self, verbosity=False):\r\n        with sr.Microphone() as source:\r\n            if verbosity: print(\"listenting passively...\", end=\"\")\r\n            audio = self.ears.listen(source)\r\n            query = ''\r\n\r\n            try: \r\n                query = self.ears.recognize(audio)\r\n                if verbosity: print(f\"user said: {query}\")\r\n            except Exception as e:\r\n                if verbosity: print(str(e))\r\n        \r\n        # if any keyword is present in the query return True (awake the assistant)\r\n        if any(keyword in query.split() for keyword in self.Keywords):\r\n            self.start_new_conversation()\r\n            self.play('wake.mp3',PlayAndWait=False)\r\n            self.is_awake = True\r\n        \r\n    def record_to_file(self, file_path):\r\n        wf = wave.open(file_path, 'wb', )\r\n        wf.setnchannels(self.CHANNELS)\r\n        sample_width = pyaudio.PyAudio().get_sample_size(self.FORMAT)\r\n        wf.setsampwidth(sample_width)\r\n        frames = self.record()\r\n        \r\n        wf.setframerate(self.RATE)\r\n        wf.writeframes(b''.join(frames))\r\n        wf.close()\r\n    \r\n    def record(self):\r\n        # Your current setup\r\n        vad_rate = 32000\r\n        frame_length_ms = 20\r\n        vad_CHUNK = (vad_rate * frame_length_ms) // 1000\r\n\r\n\r\n        p = pyaudio.PyAudio()\r\n        vad_stream = p.open(format=self.FORMAT,\r\n                        channels=1,\r\n                        rate=vad_rate,\r\n                        input=True,\r\n                        frames_per_buffer=vad_CHUNK)\r\n        \r\n        rec_stream = 
p.open(format=self.FORMAT,\r\n                        channels=self.CHANNELS,\r\n                        rate=self.RATE,\r\n                        input=True,\r\n                        frames_per_buffer=self.CHUNK)\r\n\r\n        frames = []\r\n        try:\r\n            silence_time = 0\r\n            speaked = False\r\n            is_voice = False\r\n            print(\"listening...\")\r\n            \r\n            start_time = time.perf_counter()\r\n            while True:\r\n                rec_data = rec_stream.read(self.CHUNK)\r\n                frames.append(rec_data)\r\n\r\n                # detect voice activity\r\n                data = vad_stream.read(vad_CHUNK) \r\n                                       \r\n                try:\r\n                    is_voice = self.vad.is_speech(data, vad_rate) \r\n                except Exception as e:\r\n                    print(f\"Error during VAD: {e}\")\r\n\r\n                # Calculate time since the last voice activity\r\n                if is_voice and (time.perf_counter()-start_time)>self.MIN_RECORDING_TIME:\r\n                    speaked = True\r\n                    silence_time = 0\r\n                else:\r\n                    silence_time += frame_length_ms / 1000\r\n\r\n                # Print debugging information (useful for tuning sensitivity)\r\n\r\n                # Stop recording if silence duration exceeds the threshold or if the time limit is reached\r\n                if (silence_time > self.RESPONSE_TIME and speaked) or (time.perf_counter() - start_time > self.MAX_RECORDING_TIME):\r\n                    break\r\n\r\n                if silence_time > self.MAX_RECORDING_TIME:\r\n                    self.go_to_sleep()\r\n                    break\r\n                \r\n                time.sleep(frame_length_ms / 10000)\r\n\r\n        except KeyboardInterrupt:\r\n            print(\"Done recording\")\r\n        except Exception as e:\r\n            print(str(e))\r\n            
print(silence_time,self.RESPONSE_TIME,self.SLEEP_DELAY)\r\n            exit()\r\n\r\n        vad_stream.stop_stream()\r\n        vad_stream.close()\r\n        rec_stream.stop_stream()\r\n        rec_stream.close()\r\n        p.terminate()\r\n        return frames\r\n    \r\n\r\n@contextmanager\r\ndef suppress_stdout():\r\n    with open(os.devnull, \"w\") as devnull:\r\n        old_stdout = sys.stdout\r\n        sys.stdout = devnull\r\n        try:  \r\n            yield\r\n        finally:\r\n            sys.stdout = old_stdout\r\n\r\n"
  },
  {
    "path": "Assistant/__init__.py",
    "content": ""
  },
  {
    "path": "Assistant/get_audio.py",
    "content": "import whisper\r\nimport pyaudio\r\n\r\n# CHUNK = 1024\r\n# FORMAT = pyaudio.paInt16\r\n# CHANNELS = 2\r\n# RATE = 44100\r\n# SILENCE_THRESHOLD = 1500\r\n\r\n# convert audio content into text\r\ndef whisper_wav_to_text(audio_name, model=[], model_name=False, prior=None):\r\n    if isinstance(model_name, str):\r\n        print('loading model ', model_name)\r\n        model = whisper.load_model(model_name)\r\n\r\n    if model == []:\r\n        raise Exception(\"model cannot be unspecified\")\r\n\r\n    print('listening to ',audio_name,'...')\r\n    # load audio and pad/trim it to fit 30 seconds\r\n    audio = whisper.load_audio(audio_name)\r\n    audio = whisper.pad_or_trim(audio)\r\n\r\n    # make log-Mel spectrogram and move to the same device as the model\r\n    mel = whisper.log_mel_spectrogram(audio).to(model.device)\r\n\r\n    # detect the spoken language\r\n    try:\r\n        _, probs = model.detect_language(mel)\r\n        if not(prior is None):\r\n            filt_probs = {str(lan):probs.get(lan) for lan in prior}\r\n            probs = filt_probs\r\n        print(f\"Detected language: {max(probs, key=probs.get)}\")\r\n        detected_lang = str(max(probs, key=probs.get))\r\n\r\n        options = whisper.DecodingOptions(language=detected_lang)\r\n    except:\r\n        # model does not support multiple languages, default to English\r\n        print('language: en')\r\n        detected_lang = 'en'\r\n        options = whisper.DecodingOptions(language='en')\r\n    \r\n    result = whisper.decode(model, mel, options)\r\n\r\n    # print the recognized text\r\n    print('\\n[User]: '+ result.text)\r\n    return result.text, detected_lang\r\n\r\ndef get_device_channels():\r\n    p = pyaudio.PyAudio()\r\n    DEVICES = {}\r\n    for i in range(p.get_device_count()):\r\n        dev = p.get_device_info_by_index(i)\r\n        DEVICES[i] = dev['maxInputChannels']\r\n    return DEVICES\r\n\r\ndef detect_microphones():\r\n    p = pyaudio.PyAudio()\r\n    
MICS = []\r\n    for i in range(p.get_device_count()):\r\n        dev = p.get_device_info_by_index(i)\r\n        if 'microphone' in dev['name'].lower():\r\n            MICS.append(i)\r\n    \r\n    return MICS if len(MICS)>=1 else [0]\r\n\r\ndef get_devices():\r\n    p = pyaudio.PyAudio()\r\n    DEV = []\r\n    for i in range(p.get_device_count()):\r\n        DEV.append( p.get_device_info_by_index(i))\r\n    return DEV\r\n        \r\n"
  },
  {
    "path": "Assistant/research_mode.py",
    "content": "# AGENT\nfrom langchain import OpenAI, LLMChain, PromptTemplate\nfrom langchain.llms import OpenAI\nfrom langchain.agents import Tool, AgentExecutor, ZeroShotAgent\nfrom langchain.memory import ConversationBufferMemory \nfrom langchain.agents import initialize_agent, load_tools\n\nfrom typing import Any\nfrom Assistant.semantic_scholar.agent_tools import *\nfrom Assistant.semantic_scholar.S2_tools import *\nfrom langchain.agents import AgentType\n\nclass ResearchAssistant:\n    def __init__(self, current_conversation, workspace = None, index_name = 'paperquestioning'):\n        self.current_workspace = workspace\n        self.index_name = index_name\n        self.query_engine = None\n        self.current_conversation = current_conversation\n\n        self.ans = []\n        self.docs = []\n\n        if 'workspaces' not in os.listdir(os.getcwd()):\n            os.mkdir('workspaces')\n            print('\\tinitializing Research Assistant but no workspace are available: begin a new search please')\n        elif len(os.listdir('workspaces'))>0:          \n            self.boot_workspace(workspace)\n\n        print('\\tResearch Assistant initialization done')      \n        self.agent = generateResearchAgent(self, k=1)\n\n    def boot_workspace(self, workspace):\n        if os.path.isdir(workspace):\n                print('\\tinitializing Research Assistant with Workspace directory: ', workspace)\n\n                # init vector store\n                init_attempts = 0\n                print(' ')\n                self.docs = load_workspace(workspace)\n                while True:\n                    init_attempts +=1\n\n                    try:\n                        self.query_engine, self.Index = llama_query_engine(self.docs,pinecone_index_name=self.index_name)\n                        break\n                    except Exception as e:\n                        print(f'initialization attempt {init_attempts} failed with exception {e}')\n                  
      time.sleep(2)\n                        # give up after three failed attempts instead of retrying forever\n                        if init_attempts>=3: break\n        else: return None\n\n    # make wrappers to store results and info\n    def wrapper_find_papers_from_query(self, query):  \n        try:\n            res = find_paper_from_query(query, 20)\n            text = ''\n            for result in res:\n                if not result['isOpenAccess']:continue\n                text += result['title']+'; paperId: '+result['paperId']\n                self.ans.append( f\"{result['title']}: {result['paperId']}\")\n            if len(text)==0: return \"couldn't find any open access result\"\n            return text\n        except Exception as e:\n            return f'error: {e}'\n        \n\n    def load_pdf_to_pinecone(self, paths):\n        # accept a single path or a list of paths\n        if isinstance(paths, str): paths = [paths]\n        if not isinstance(paths, list): raise TypeError('paths must be a path string or a list of path strings')\n\n        for path in paths:\n            if not path.endswith('.pdf'): continue\n            # read the pdf\n            content = readPDF(path)\n            doc = Document(\n                text = content,\n                doc_id = uuid.uuid4().hex\n            )\n            self.docs.append(doc)\n            # upload to Pinecone and sync the index (one insert per document)\n            self.Index.insert(document=doc)\n\n        # refresh the query engine\n        self.query_engine = self.Index.as_query_engine()\n        return\n\n    def PROTOCOL_begin_new_workspace(self, query):\n        # PRELIMINARY ASSESSMENT\n        prompt_template = \"Do you really need to search for something on internet?: {query} \\n Answer Yes or No\"\n        llm = OpenAI(temperature=0)\n        llm_chain = LLMChain(\n            llm=llm,\n            prompt=PromptTemplate.from_template(prompt_template)\n        )\n        assessment =  llm_chain.predict(query = query)\n        if 'yes' not in assessment.lower():\n            return 'what should be the topic of the workspace?'\n        \n        # GOING WITH IT\n        # preprocessing\n        prompt_template = \"Extract a 
search key from the following query: {query}\"\n        llm = OpenAI(temperature=0)\n        llm_chain = LLMChain(\n            llm=llm,\n            prompt=PromptTemplate.from_template(prompt_template)\n        )\n        # extraction of a query\n        search_query = llm_chain.predict(query = query)\n\n        # post processing of extracted search-query\n        search_query = re.sub('[^0-9a-zA-Z]', ' ', search_query.lower())\n        if \"search key\" in search_query: search_query=search_query.replace(\"search key\",\"\").strip()\n\n        print('SEARCH QUERY: ', search_query)\n        self.current_workspace = PaperSearchAndDownload(query=search_query)\n        self.boot_workspace(self.current_workspace)\n        return f'new workspace created at {self.current_workspace}'\n    \n    def wrapper_download_paper(self, id):\n        if 'cache' not in os.listdir(self.current_workspace): os.mkdir(os.path.join(self.current_workspace,'cache'))\n        ans = download_pdf_from_id(paperid= id, path= os.path.join(self.current_workspace, 'cache'))\n        update_workspace_dataframe(self.current_workspace, verbose = False)\n        pdf_paths = ans.split('\\n')\n        self.load_pdf_to_pinecone(pdf_paths)\n        return ans\n    \n    def wrapper_find_reccomendation(self, paperId):\n        return find_recommendations(paper=paperId, result_limit=5)\n    \n    def find_in_papers(self, query):\n        attempt =0\n        while True:\n            try: \n                attempt +=1\n                answer = self.query_engine.query(query)\n            except Exception as e:\n                if attempt<=3: continue\n                return str(e)\n            return answer\n        \n\n\n# generate a Zero Shot React Agent with memory that looks K interactions behind\ndef generateResearchAgent(RA:ResearchAssistant, k:int):\n\n    findpapers = Tool(\n        name='Find from query',\n        description='find a paper from a query, title and/or other information.',\n        func= 
RA.wrapper_find_papers_from_query\n    )\n\n    download_ID = Tool(\n        name='Download ID',\n        description='download a paper from paperId. Take as input a paperId',\n        func=RA.wrapper_download_paper\n    )\n\n\n    peek = Tool(\n        name='glimpse pdf',\n        description=\"get paper information if available, Take as input a paper title\",\n        func=glimpse_pdf\n    )\n\n    reccomend = Tool(\n        name='find reccomendations',\n        description='find similar paper from paperId. Take as input a paperId',\n        func=RA.wrapper_find_reccomendation\n    )\n    \n    tools = [findpapers, download_ID, peek, reccomend]\n    \n    memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, human_prefix='user', ai_prefix='assistant')\n\n    # need to work on a custom LangChain llm model \n    prefix = \"\"\"You are an assistant designed to browse scientific libraries, you have the following tools to complete the user requests:\"\"\"\n    suffix = \"\"\"Begin!\"\n\n    {chat_history}\n    Question: {input}\n    {agent_scratchpad}\"\"\"\n\n    prompt = ZeroShotAgent.create_prompt(\n        tools, \n        prefix=prefix, \n        suffix=suffix, \n        input_variables=[\"input\", \"chat_history\", \"agent_scratchpad\"]\n    )\n    llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)\n    agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)\n\n    # adding a window of memory:\n    memory = build_memory(chat_history = RA.current_conversation(), k=k)\n    \n    return AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True, memory=memory, early_stopping_method ='generate', max_iterations=20)\n\ndef build_memory(chat_history, k):\n    memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, human_prefix='user', ai_prefix='assistant')\n    k = min(k, len(chat_history)//2)\n\n    if k==0 :return memory\n\n    if chat_history[-k][\"role\"] != 
'user': \n        print('refreshing memory warning - considering last interaction only')\n        k=1\n    try:\n        for i in range(-k*2-1, -1, 2):\n            input  = chat_history[i]['content']\n            output = chat_history[i+1]['content']\n            memory.save_context({\"input\":input}, {\"output\":output})\n    except:\n        memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, human_prefix='user', ai_prefix='assistant')\n    return memory\n"
  },
  {
    "path": "Assistant/semantic_scholar/S2_tools.py",
    "content": "import csv\nimport re\nfrom time import time\nimport requests\nimport dotenv\nimport aspose.pdf as ap\ndotenv.load_dotenv()\n\nimport argparse\nimport os\nfrom requests import Session\nfrom typing import Generator, Union\nimport subprocess\nimport urllib3\nimport json\nurllib3.disable_warnings()\n\nimport refy\n\nimport pdftitle\nfrom langchain.document_loaders import OnlinePDFLoader\nimport time\nimport arxiv\nfrom pymed import PubMed\n\nfrom .simple import Main\n\nRESULT_LIMIT = 10\nS2_API_KEY = os.environ['S2_API_KEY']\n\nPAPER_FIELDS = 'paperId,externalIds,title,authors,year,abstract,isOpenAccess,openAccessPdf,influentialCitationCount,citationStyles,tldr,venue,journal'\n\ndef get_paper(session: Session, paper_id: str, fields: str = 'paperId,title', **kwargs) -> dict:\n    params = {\n        'fields': fields,\n        **kwargs,\n    }\n    headers = {\n        'x-api-key': S2_API_KEY,\n    }\n\n    with session.get(f'https://api.semanticscholar.org/graph/v1/paper/{paper_id}', params=params, headers=headers) as response:\n        response.raise_for_status()\n        return response.json()\n\n\n\ndef find_paper_from_query(query, result_limit=RESULT_LIMIT):\n    papers = None\n    while papers is None:\n        try:\n            while True:\n                rsp = requests.get('https://api.semanticscholar.org/graph/v1/paper/search',\n                                        params={'query': query, 'limit': result_limit, 'fields': PAPER_FIELDS})\n                if rsp.status_code == 429: \n                    time.sleep(60)\n                    continue\n                break\n            \n            rsp.raise_for_status()\n            results = rsp.json()\n            total = results[\"total\"]\n            if not total:\n                print('No matches found. Please try another query.')\n                return 'No matches found. 
Please try another query.'\n\n            \n\n            papers = results['data']\n            filtered = []\n            for paper in (papers):\n                if paper['isOpenAccess']: filtered.append(paper)\n                if len(filtered)>=result_limit:break\n                    \n        except Exception as e:\n            print('\\n!!!!!!!!\\n')\n            print('ERROR OCCURRED: ',e)\n            print(rsp)\n            return rsp.status_code\n\n    print(f'Found {total} results. OpenAccess: {len(filtered)}.')\n    return papers\n\n# Finds papers which are similar to an exisiting one\ndef find_recommendations(paper, result_limit = RESULT_LIMIT):\n\n    print(f\"Looking for up to {result_limit} recommendations based on: {paper['title']}\")\n    rsp = requests.get(f\"https://api.semanticscholar.org/recommendations/v1/papers/forpaper/{paper['paperId']}\",\n                       params={'fields': 'title,url,isOpenAccess', 'limit': result_limit})\n    \n    rsp.raise_for_status()\n    results = rsp.json()\n    print_papers(results['recommendedPapers'])\n    return results['recommendedPapers']\n\ndef extract_title(path):\n    try:\n        title = pdftitle.get_title_from_file(path)\n\n        # remove non letter chars\n        title = re.sub('[^0-9a-zA-Z]', ' ', title)\n\n        # sometime the full text is returned instead of just the title. 
in that \n        # case use a coarse title detection\n        if len(title.split())>30: raise Exception\n        return title\n    except:\n        # ROUGH TITLE DETECTION\n        loader = OnlinePDFLoader(path)\n        data  = loader.load()\n        text_content = ''\n        formatted_content = data[0].page_content.replace('\\n\\n', ' ')\n        text_content+=formatted_content\n        \n        # CONSIDER THE FIRST 100 WORDS\n        # exclude single chars and remove puncts\n        title = ' '.join([word for word in text_content.split()[0:min(100,len(text_content.split()))] if (len(word)>1)])\n        title = re.sub('[^0-9a-zA-Z]', ' ', title)\n\n        # option 1: take the capital words contained in the first 100 words\n        title = ' '.join([word for word in title.split()[0:min(100,len(title.split()))] if (word.isupper() and len(word)>1)])\n        if len(title)>20: title = \" \".join(title.split()[0:10])\n\n        # option 2: take the first 10 words \n        else:\n            title = [word for word in text_content.split()[0:100] if (len(word)>1)]\n            title = ' '.join( title[0:min(10,len(title))])\n\n        print(' > generated title: ', title)\n        if title=='': return []\n        return title\n\ndef find_paper_online(path):\n    # True only if every word of title1 is present in title2\n    # (all() is required: a bare generator expression is always truthy)\n    def same_title(title1, title2):\n        title2_words = title2.lower().split()\n        return all(word in title2_words for word in title1.lower().split())\n    \n    # OPEN AND EXTRACT PAPER TITLE\n    title = extract_title(path)\n\n    # LOOK IN OTHER PAPER DATABASES\n    # 1) scholar attempt\n    while True:\n        res = find_paper_from_query(title, result_limit=5)\n        if isinstance(res, int):\n            if res == 400: raise Exception\n            if res == 429: \n                time.sleep(60)\n            continue\n        break\n    if isinstance(res, list):\n        for article in res:\n            if same_title(title, article['title']): return article\n\n 
   # 2) arxiv attempt\n    search = arxiv.Search(\n        query=title,\n        id_list= [],\n        max_results=5,\n    )\n    res = search.results()\n    for article in res:\n        if same_title(article.title, title): \n            return  article._raw\n        \n    # 3) pubmed attempt\n    pubmed = PubMed(tool=\"MyTool\", email=\"my@email.address\")\n    res = pubmed.query(title, max_results=5)\n    \n    for article in res:\n        art = json.loads(article.toJSON())\n        if same_title(title, art['title']):return art\n\n    # noting found :(\n    return \n    \ndef print_papers(papers):\n    results = ''\n    for idx, paper in enumerate(papers):\n        results+= f\"{idx}  {paper['title']} {paper['url']}\"\n    return results\n\n\ndef chunks(items, chunk_size):\n    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]\n\n\ndef fetch_paper_batch(paperid: list):\n    req = {'ids': [f'{id}' for id in paperid]}\n    # https://api.semanticscholar.org/api-docs/graph#tag/Paper-Data/operation/post_graph_get_papers\n    \n    rsp = requests.post('https://api.semanticscholar.org/graph/v1/paper/batch',\n                        params={'fields': PAPER_FIELDS},\n                        json=req)\n    \n    if rsp.status_code != 200:\n        return f'Problem fetching {req}: ' + rsp.text\n    return rsp.json()\n\n\ndef download_pdf_from_id(paperid, path=os.getcwd()):\n    try:\n        res = Main(paper_ids=paperid, dir=path)\n        print(res)\n        return res\n    except:\n        return f'error with {paperid}'\n\n# add to PAPER.CSV semantic scolar entries from ID\ndef update_dataframe(incomplete, dest):\n    results = fetch_paper_batch(paperid= [item['paperId'] for item in incomplete])\n    if isinstance(results, str): \n        print(results) \n        print(f\" input: {incomplete}\")\n        return\n    pdf_des= dest\n    pdf_des = pdf_des[:-4] + '.pdf'\n    text = ''\n\n    with open(pdf_des, 'a+',encoding='utf-8') as f:\n        for 
paper in results:\n            try:\n                text += paper['title'].upper()+'\\n'\n                if 'tldr' in paper.keys():\n                    if paper['tldr'] is not None:text += paper['tldr']['text']+'\\n'\n                text += paper['abstract']+'\\n'\n                if 'summary' in paper.keys(): text += paper['summary']+'\\n'\n                text += '\\n\\n'\n            except:\n                pass\n\n    write_to_pdf(text, pdf_des)\n\n    count = 0\n\n    # Read existing entries from the CSV file\n    existing_entries = set()\n\n    isFile =  os.path.isfile(dest)\n    if not isFile:\n        with open(dest, 'w',encoding='utf-8') as fp:\n            csvfile = csv.DictWriter(fp, ['paperId', 'title', 'first_author', 'year', 'abstract','tldr','bibtex','influentialCitationCount','venue','journal','pages'])\n            csvfile.writeheader()\n    if isFile:\n        with open(dest, 'r',encoding='utf-8') as fp:\n            csvfile = csv.DictReader(fp)\n            for row in csvfile:\n                existing_entries.add(row['paperId'])\n\n    # Append new entries to the CSV file\n    with open(dest, 'a', encoding='utf-8') as fp:\n        csvfile = csv.DictWriter(fp, ['paperId', 'title', 'first_author', 'year', 'abstract','tldr','bibtex','influentialCitationCount','venue','journal','pages'])\n        \n        for paper in results:\n            paperId = paper['paperId']\n            if paperId in existing_entries:\n                continue  # Skip if the entry already exists\n            \n            paper_authors = paper.get('authors', [])\n            \n            journal_data = {}\n            if 'journal' in paper:\n                journal_data = paper.get('journal',[])\n            if journal_data is not None:\n                if 'name' not in journal_data: journal_data['name'] = ''\n                if 'pages' not in journal_data: journal_data['pages'] = ''\n\n            # prefer the TLDR when it is present and not None, then the summary, then the abstract\n            if paper.get('tldr'):\n                tldr = 
paper['tldr']['text']\n            elif paper.get('summary',[]) != []:\n                tldr = paper['summary']\n            else:\n                tldr = paper['abstract']\n\n            csvfile.writerow({\n                'title': paper['title'],\n                'first_author': paper_authors[0]['name'] if paper_authors else '',\n                'year': paper['year'],\n                'abstract': paper['abstract'],\n                'paperId': paperId,\n                'tldr':tldr,\n                'bibtex':paper['citationStyles']['bibtex'] if paper['citationStyles']['bibtex'] else '',\n                'influentialCitationCount':paper['influentialCitationCount'],\n                'venue':paper['venue'],\n                'journal':journal_data['name'] if journal_data is not None else '',\n                'pages':journal_data['pages'] if journal_data is not None else '',\n            })\n            # except Exception as e:\n            #     print('error adding paper: ',e, '\\n',paper)\n            #     paper['title']\n            #     paper['year']\n            #     paper['abstract']\n            #     paper['citationStyles']['bibtex']\n                \n            #     if paper['tldr']: paper['tldr']\n            #     if paper_authors: paper_authors[0]['name']\n\n            #     quit()\n            count += 1\n\n    return f'Added {count} new results to {dest}'\n\n\n\n\ndef write_bib_file(csv_file, bib_file=None):\n    if bib_file is None:\n        bib_file = csv_file[:-4]+'.bib'\n    with open(csv_file, 'r', encoding='utf-8') as file:\n        reader = csv.DictReader(file)\n        with open(bib_file, 'w', encoding='utf-8') as output:\n            print(f'writing bibtex file at {bib_file}')\n            for row in reader:\n                bib_entry = create_bib_entry(row)\n                output.write(bib_entry + '\\n\\n')\n\ndef create_bib_entry(row):\n    paper_id = row['paperId']\n    title = row['title']\n    author = row['first_author']\n    year = 
row['year'].split('-')[0] # assuming format like 2023-03-24T15:46:10Z (arxiv use this)\n\n    journal_match = re.search(r\"journal\\s*=\\s*{([^}]*)}\", row['bibtex'])\n    if journal_match:\n        journal = journal_match.group(1)\n    else: journal = ''\n    pages_match = re.search(r\"pages\\s*=\\s*{([^}]*)}\", row['bibtex'])\n    if pages_match:\n        pages = pages_match.group(1)\n    else: pages = ''\n\n    abstract = replace_non_alphanumeric(row['abstract'])\n\n    # Generate the BibTeX entry\n    bib_entry = f\"@ARTICLE{{{paper_id},\\n\"\n    bib_entry += f\"  title     = \\\"{title}\\\",\\n\"\n    bib_entry += f\"  author    = \\\"{author}\\\",\\n\"\n    bib_entry += f\"  abstract  = \\\"{abstract}\\\",\\n\"\n    bib_entry += f\"  year      = {year},\\n\"\n    bib_entry += f\"  journal   = \\\"{journal}\\\",\\n\"\n    bib_entry += f\"  pages     = \\\"{pages}\\\"\\n\"\n    bib_entry += \"}\"\n\n    return bib_entry\n\n\ndef replace_non_alphanumeric(string, replacement=' '):\n    pattern = r'[^a-zA-Z0-9]'\n    replaced_string = re.sub(pattern, replacement, string)\n    return replaced_string\n\n\ndef refy_reccomend(bib_path, number=20):\n    d = refy.Recomender(\n        bib_path,            # path to your .bib file\n        n_days=30,               # fetch preprints from the last N days\n        html_path=os.path.join(os.path.join(bib_path.replace('\\\\results\\\\papers.bib',''),'refy_suggestions'),\"test.html\"),   # save results to a .csv (Optional)\n        N=number                 # number of recomended papers \n        )\n\n\ndef write_to_pdf(text, dest):\n    # Initialize document object\n    document = ap.Document()\n\n    # Add page\n    page = document.pages.add()\n\n    # Initialize textfragment object\n    text_fragment = ap.text.TextFragment(text)\n\n    # Add text fragment to new page\n    page.paragraphs.add(text_fragment)\n\n    # Save updated PDF\n    document.save(dest)"
  },
  {
    "path": "Assistant/semantic_scholar/__init__.py",
    "content": "#"
  },
  {
    "path": "Assistant/semantic_scholar/agent_tools.py",
    "content": "from contextlib import contextmanager\nimport uuid\nimport os\nimport tiktoken\n\nfrom . import S2_tools as scholar\n\nimport csv\nimport sys\nimport requests\n\n# pdf loader\nfrom langchain.document_loaders import OnlinePDFLoader\n\n## paper questioning tools\nfrom llama_index import Document\nfrom llama_index.vector_stores import PineconeVectorStore\nfrom llama_index import GPTVectorStoreIndex, StorageContext, ServiceContext\nfrom llama_index.embeddings.openai import OpenAIEmbedding\n \n\ndef PaperSearchAndDownload(query):\n    # make new workspace \n    if not os.path.exists( os.path.join(os.getcwd(),'workspaces') ): os.mkdir(os.path.join(os.getcwd(),'workspaces'))\n    workspace_dir_name = os.path.join(os.getcwd(),'workspaces',query.split()[0] + '_'+ str(uuid.uuid4().hex))\n    os.mkdir(workspace_dir_name)\n    os.mkdir(os.path.join(workspace_dir_name,'results'))\n    os.mkdir(os.path.join(workspace_dir_name,'refy_suggestions'))\n    os.environ['workspace'] = workspace_dir_name\n\n    # 1) search papers\n    print('  1) Searching base papers') \n    papers = scholar.find_paper_from_query(query, result_limit=10)\n    if len(papers == 0):\n        papers = scholar.find_paper_from_query(query, result_limit=50)\n    scholar.update_dataframe(incomplete=papers, dest=os.path.join(workspace_dir_name, 'results','papers.csv'))\n    delete_duplicates_from_csv(csv_file=os.path.join(workspace_dir_name, 'results','papers.csv'))\n\n    # 2) Cross-reference reccomendation system:\n    # a paper is reccomended if and only if it's related to more than one paper\n    print('\\n\\n 2) Expanding with Scholar reccomendations')\n    counts = {}\n    candidates = {}\n    for paper in papers:\n        guesses = scholar.find_recommendations(paper)\n\n        for guess in guesses:\n            if not guess['isOpenAccess']: continue\n\n            candidates[guess['title']] = guess\n            if guess['title'] not in counts.keys(): counts[guess['title']] = 1\n            
else: counts[guess['title']] += 1\n    \n    # reccomend only papers that appeared more than once \n    reccomends = []\n    for key in counts:\n        if counts[key]>1: reccomends.append(candidates[key])\n\n    print(f'found {len(reccomends)} additional papers')\n    # update the csv\n    scholar.update_dataframe(incomplete= reccomends, dest=os.path.join(workspace_dir_name, 'results','papers.csv'))\n    delete_duplicates_from_csv(csv_file=os.path.join(workspace_dir_name, 'results','papers.csv'))\n\n    # download the papers (1/2)\n    print('downloading papers (1/2)')\n    with open(os.path.join(workspace_dir_name,'results','papers.csv'), 'r',encoding='utf-8') as fp:\n        csvfile = csv.DictReader(fp)  \n        scholar.download_pdf_from_id(\" \".join( row['paperId'] for row in csvfile), workspace_dir_name)\n    \n    scholar.write_bib_file(csv_file=os.path.join(workspace_dir_name,'results','papers.csv'), bib_file=os.path.join(workspace_dir_name,'results','papers.bib'))\n\n    # expand further with refy reccomendendation system\n    print('\\n\\n 3) Expanding with Refy reccomendendation system')\n    print('this might take a while...')\n    scholar.refy_reccomend(bib_path=os.path.join(workspace_dir_name,'results','papers.bib'))\n\n    with open(os.path.join(workspace_dir_name, 'refy_suggestions', 'test.csv'), 'r',encoding='utf-8') as fp:\n        csvfile = csv.DictReader(fp) \n        for row in csvfile:\n            title = scholar.replace_non_alphanumeric(row['title'])\n            title = title.replace(\" \",\"_\")\n\n            save_path = os.path.join(workspace_dir_name,'refy_suggestions',(title+'.pdf'))\n            try:\n                download_paper(url=row['url'], save_path=save_path)\n            except:\n                print(f'couldn t download {row}')\n\n    return f'{os.path.join(os.getcwd(), workspace_dir_name)}'\n\n\nimport urllib\ndef download_paper(url, save_path=f\"{uuid.uuid4().hex}.pdf\"):\n    success_string = f\"paper saved 
successfully at {os.path.join(os.path.abspath(save_path))}\"\n    if url.endswith('.pdf'):\n        urllib.request.urlretrieve(url, save_path)\n        return success_string\n    if 'doi' in url:\n        doi = paper_id = \"/\".join(url.split(\"/\")[-2:])\n        # Construct the doi.org resolver URL\n        print(doi)\n        doi_url = f\"https://doi.org/{doi}\"\n\n        # Send a GET request to the doi.org URL\n        response = requests.get(doi_url, allow_redirects=True)\n\n        # Check if the request was successful\n        if response.status_code == 200:\n            # Extract the final URL after redirection\n            url = response.url\n\n    if 'arxiv' in url:\n        # URL of the paper on arXiv\n\n        # Get the paper ID from the URL\n        paper_id = url.split(\"/\")[-1]\n\n        # Build the download URL of the paper\n        pdf_url = f\"http://arxiv.org/pdf/{paper_id}.pdf\"\n\n        # Download the paper as a PDF\n        urllib.request.urlretrieve(pdf_url, save_path)\n        return success_string\n\n    else:\n        if '/full' in url:\n            urllib.request.urlretrieve(url.replace('/full','/pdf'), save_path)\n            return success_string\n        if 'plos.org' in url:\n            final_url = url.replace('article?', 'article/file?')\n            urllib.request.urlretrieve(final_url, save_path)\n            return success_string\n    \n    return f'\\nfailed to download {url}'\n        \n\ndef download_bibtex_library(csv_path):\n    with open(csv_path, 'r',encoding='utf-8') as fp:\n        csvfile = csv.DictReader(fp) \n        for row in csvfile:\n            title = scholar.replace_non_alphanumeric(row['title'])\n            title = title.replace(\" \",\"-\")\n\n            save_path = os.path.join(os.path.join(csv_path, '..', title+'.pdf'))\n            try:\n                download_paper(url=row['url'], save_path=save_path)\n            except:\n                try:\n                    download_paper(url=row['url']+'.pdf', 
save_path=save_path)\n                except:\n                    print(f'couldn t download {row}')\n\ndef generate_chunks(text, CHUNK_LENGTH = 4000):\n    enc = tiktoken.encoding_for_model(\"gpt-4\")\n    tokens = enc.encode(text)\n    token_chunks = [tokens[i:i + CHUNK_LENGTH] for i in range(0, len(tokens), CHUNK_LENGTH)]\n\n    word_chunks = [enc.decode(chunk) for chunk in token_chunks]\n    return word_chunks\n\n\nfrom langchain.vectorstores import Chroma, Pinecone\nfrom langchain.embeddings.openai import OpenAIEmbeddings\nimport pinecone\n\nimport langid\nimport time\n\n# def process_pdf_folder(folder_path):\n#     if not os.path.exists(folder_path):\n#         return 'the folder does not exist, check your spelling'\n\n#     for item in os.listdir(folder_path):\n#         if not item.endswith('.pdf'):continue\n        \n#         with open(os.path.join(folder_path,'SUMMARY.txt'), 'a', encoding='UTF-8') as write_file:\n#             write_file.write(item)\n#             write_file.write(\"\\n\\n\\n\")\n#             txt = summarize_pdf(item, model='Vicuna')\n#             try:\n#                 write_file.write(txt)\n#             except:\n#                 print(txt)\n    \n#     with open(os.path.join(folder_path,'SUMMARY.txt'), 'r', encoding='UTF-8') as read_file:\n#         return read_file.read()\n\n\n\n# # def summarize_pdf(pdf_path, model= None):\n#     text = readPDF(pdf_path)\n\n#     # according to the TLDR Model, consider smaller chunks\n#     text_chunks = generate_chunks(text, 700)\n\n#     if model is not None:\n#         summarizer = LocalSearchEngine(tldr_model=model)\n    \n#     summary=''\n#     for chunk in text_chunks:\n#         summary += summarizer.tldr(chunk)\n\n#     return summary\n\ndef get_result_path(path, exclude = []):\n    for item in os.listdir(path):\n        if item == 'papers.csv':\n            return os.path.join(path, item)\n        if os.path.isdir(os.path.join(path, item)) and item not in exclude: \n            res = 
get_result_path(os.path.join(path, item))\n            if res: return res\n    return \n\ndef get_workspace_titles(workspace_name):\n    csv_file_path = get_result_path(workspace_name)\n    papers_available = []\n    with open(csv_file_path, 'r', encoding='utf-8') as file:\n        csv_file = csv.DictReader(file)\n        for row in csv_file:\n            papers_available.append(row['title'])\n    return papers_available\n\nimport re\ndef same_title(title1, title2):\n    try:\n        title1 = re.sub(r'[^a-zA-Z]', ' ', title1)\n        title2 = re.sub(r'[^a-zA-Z]', ' ', title2)\n    except:\n        return False\n    words1 = set(title1.lower().split())\n    words2 = set(title2.lower().split())\n    return words1 == words2 or words1 <= words2 or words1 >= words2\n\n\ndef glimpse_pdf(title):\n    # find papers.csv in workspace\n    \n    for workspace_name in os.listdir('workspaces'):\n        csv_file_path = get_result_path(workspace_name)\n        if csv_file_path is None: return 'no paper found'\n        \n        with open(csv_file_path, 'r', encoding='utf-8') as file:\n            csv_file = csv.DictReader(file)\n            for row in csv_file:\n                if same_title(row['title'], title): return f\"{row['title']}, paperId: {row['paperId']}, summary: {row['abstract']}\"\n    \n    return f'\\nno paper found with title {title}'\n    \ndef count_tokens(text):\n    enc = tiktoken.encoding_for_model(\"gpt-4\")\n    tokens = enc.encode(text)\n    return len(tokens)\n\ndef readPDF(pdf_path):\n    loader = OnlinePDFLoader(pdf_path)\n    data  = loader.load()\n    text_content = ''\n\n    for page in data:\n        formatted_content = page.page_content.replace('\\n\\n', ' ')\n        text_content+=formatted_content\n    \n    return text_content\n\ndef get_pdf_path(dir, exclude=[]):\n    paths = []\n    for item in os.listdir(dir):\n        itempath = os.path.join(dir,item)\n        if item.endswith('.pdf'): paths.append(itempath)\n        if 
os.path.isdir(itempath)and item not in exclude: \n            subpaths = get_pdf_path(itempath)\n            for i in subpaths: paths.append(i)\n    return paths\n\ndef delete_duplicates_from_csv(csv_file):\n    print('verifying duplicates...') \n    to_delete = []\n    def delete_csv_row_by_title(csv_file, title):\n        # Read the CSV file and store rows in a list\n        with open(csv_file, 'r',encoding='UTF-8') as file:\n            reader = csv.DictReader(file)\n            rows = list(reader)\n\n        # Find the row index with the matching title\n        row_index = None\n        for index, row in enumerate(rows):\n            if row['title'] == title:\n                row_index = index\n                break\n\n        # If no matching title is found, return\n        if row_index is None:\n            print(f\"No row with title '{title}' found.\")\n            return\n\n        # Remove the row from the list\n        del rows[row_index]\n\n        # Write the updated rows back to the CSV file\n        with open(csv_file, 'w', newline='',encoding='UTF-8') as file:\n            fieldnames = reader.fieldnames\n            writer = csv.DictWriter(file, fieldnames=fieldnames)\n            writer.writeheader()\n            writer.writerows(rows)\n    \n    \n    with open(csv_file, 'r', encoding='UTF-8') as file:\n        DELETED = 0\n        reader = csv.DictReader(file)\n        rows = list(reader)\n        entries = set()\n        for row in rows:\n            if row['title']=='' or row['title'] is None: continue\n            if row['title'] not in entries:entries.add(row['title'])\n            else:\n                DELETED+=1\n                to_delete.append(row['title'])\n                \n        for title in to_delete: delete_csv_row_by_title(csv_file, title=title)\n    print(f\"Deleted {DELETED} duplicates\")\n    return\n            \n        \n\ndef update_workspace_dataframe(workspace, verbose = True):\n    ADDED = 0\n    # find results.csv \n    
csv_path = get_result_path(workspace)\n\n    # get titles in csv\n    titles = get_workspace_titles(workspace)\n\n    # get local papers path\n    paths = get_pdf_path(workspace, exclude='refy_suggestions')\n\n    # adding new to csv:\n    for path in paths:\n        \n        exists = False\n\n        # extract the title from the local paper\n        title = scholar.extract_title(path)\n\n        for t in titles:\n            if same_title(t,title): exists = True\n            \n        \n        # add it to dataframe if it was not found on the DF\n        if not exists:\n            if verbose: print(f\"\\nnew paper detected: {title}\")\n            # find it with online\n            paper = scholar.find_paper_online(path)\n            if paper : \n                if verbose: print(f\"\\t---> best match found online: {paper['title']} \" )\n                for t in titles:\n                    if same_title(paper['title'], title): \n                        if verbose: print(f\"\\t    this paper is already present in the dataframe. skipping\")\n            else: \n                if verbose: print(path,   '-x-> no match found')\n                continue\n            with open(csv_path, 'a', encoding='utf-8') as fp:\n                areYouSure = True\n                for t in titles:\n                    if same_title(t,paper['title']): areYouSure =False\n                if not areYouSure:\n                    if verbose: print(f\"double check revealed that the paper is already in the dataframe. 
Skipping\") \n                    continue\n                if verbose: print(f\"\\t---> adding {paper['title']}\")\n                ADDED +=1\n                paper_authors = paper.get('authors', [])\n                journal_data = {}\n                if 'journal' in paper:\n                    journal_data = paper.get('journal',[])\n                if journal_data is not None:\n                    if 'name' not in journal_data: journal_data['name'] = ''\n                    if 'pages' not in journal_data: journal_data['pages'] = ''\n\n                if paper.get('tldr',[]) != []:tldr = paper['tldr']['text']\n                elif paper.get('summary',[]) != []:tldr = paper['summary']\n                elif 'abstract' in paper:tldr = paper['abstract']\n                else: tldr = 'No summary available'\n\n                if 'year' in paper:\n                    year = paper['year']\n                elif 'updated' in paper:year = paper['updated']\n                else: year = ''\n\n                if 'citationStyles' in paper:\n                    if 'bibtex' in paper['citationStyles']: citStyle = paper['citationStyles']['bibtex'] \n                    else: citStyle = paper['citationStyles'][0]\n                else: citStyle = ''\n\n                csvfile = csv.DictWriter(fp, ['paperId', 'title', 'first_author', 'year', 'abstract','tldr','bibtex','influentialCitationCount','venue','journal','pages'])\n                try:\n                    csvfile.writerow({\n                        'title': paper['title'],\n                        'first_author': paper_authors[0]['name'] if paper_authors else '',\n                        'year': year,\n                        'abstract': paper['abstract'] if 'abstract' in paper else '',\n                        'paperId': paper['paperId'] if 'paperId' in paper else '',\n                        'tldr':tldr,\n                        'bibtex':citStyle,\n                        'influentialCitationCount': 
paper['influentialCitationCount'] if 'influentialCitationCount' in paper else '0',\n                        'venue':paper['venue'] if 'venue' in paper else '',\n                        'journal':journal_data['name'] if journal_data is not None else '',\n                        'pages':journal_data['pages'] if journal_data is not None else '',\n                    })      \n                except Exception as e: \n                    if verbose: print('could not add ', title, '\\n',e)\n                # delete dupes if present\n\n    if verbose: print(f\"\\n\\nCSV UPDATE: Added {ADDED} new papers\")\n    \n    # clean form dupes\n    delete_duplicates_from_csv(csv_path)\n    \n    # update bib\n    scholar.write_bib_file(csv_path)\n    return\n    \n\n\ndef load_workspace(folderdir):\n    docs =[]\n    \n    for item in os.listdir(folderdir):\n        if item.endswith('.pdf'):\n            print(f'   > loading {item}')\n            with suppress_stdout():\n                content = readPDF(os.path.join(folderdir, item))\n                docs.append(Document(\n                    text = content,\n                    doc_id = uuid.uuid4().hex\n                ))\n        \n        if item =='.'or item =='..':continue\n        if os.path.isdir( os.path.join(folderdir,item) ):\n            sub_docs = load_workspace(os.path.join(folderdir,item))\n            for doc in sub_docs:\n                docs.append(doc)\n        \n    return docs\n\n# List paths of all pdf files in a folder\ndef list_workspace_elements(folderdir):\n    docs =[] \n    for item in os.listdir(folderdir):\n        if item.endswith('.pdf'):\n            docs.append(rf\"{os.path.join(folderdir,item)}\")\n        \n        if item =='.'or item =='..':continue\n        if os.path.isdir( os.path.join(folderdir,item) ):\n            sub_docs = list_workspace_elements(os.path.join(folderdir,item))\n            for doc in sub_docs:\n                docs.append(doc)\n    return docs\n\ndef 
llama_query_engine(docs:list, pinecone_index_name:str):\n    pinecone.init(\n        api_key= os.environ['PINECONE_API_KEY'],\n        environment= os.environ['PINECONE_API_ENV']\n    )\n    \n    # Find the pinecone index\n    if pinecone_index_name not in pinecone.list_indexes():\n        # we create a new index\n        pinecone.create_index(\n            name=pinecone_index_name,\n            metric='dotproduct',\n            dimension=1536  # 1536 dim of text-embedding-ada-002\n        )\n\n    index = pinecone.Index(pinecone_index_name)\n    \n    # init it\n    vector_store = PineconeVectorStore(pinecone_index=index)\n    time.sleep(1)\n\n    # setup our storage (vector db)\n    storage_context = StorageContext.from_defaults(\n        vector_store=vector_store\n    )\n\n    embed_model = OpenAIEmbedding(model='text-embedding-ada-002', embed_batch_size=100)\n    service_context = ServiceContext.from_defaults(embed_model=embed_model)\n\n    \n    # populate the vector store\n    LamaIndex = GPTVectorStoreIndex.from_documents(\n        docs, storage_context=storage_context,\n        service_context=service_context\n    )\n\n    print('PINECONE Vector Index initialized:\\n',index.describe_index_stats())\n\n    # init the query engine\n    query_engine = LamaIndex.as_query_engine()\n    \n    return query_engine, LamaIndex\n\n\n\n@contextmanager\ndef suppress_stdout():\n    with open(os.devnull, \"w\") as devnull:\n        old_stdout = sys.stdout\n        sys.stdout = devnull\n        try:  \n            yield\n        finally:\n            sys.stdout = old_stdout"
  },
  {
    "path": "Assistant/semantic_scholar/simple.py",
    "content": "#!/usr/bin/env python3\nimport dotenv\ndotenv.load_dotenv()\nimport re\nimport argparse\nimport os\nfrom requests import Session\nfrom typing import Generator, Union\n\nimport urllib3\nurllib3.disable_warnings()\n\nS2_API_KEY = os.environ['S2_API_KEY']\n\n\ndef get_paper(session: Session, paper_id: str, fields: str = 'paperId,title', **kwargs) -> dict:\n    params = {\n        'fields': fields,\n        **kwargs,\n    }\n    headers = {\n        'x-api-key': S2_API_KEY,\n    }\n\n    with session.get(f'https://api.semanticscholar.org/graph/v1/paper/{paper_id}', params=params, headers=headers) as response:\n        response.raise_for_status()\n        return response.json()\n\n\ndef download_pdf(session: Session, url: str, path: str, user_agent: str = 'requests/2.0.0'):\n    # send a user-agent to avoid server error\n    headers = {\n        'user-agent': user_agent,\n    }\n\n    # stream the response to avoid downloading the entire file into memory\n    with session.get(url, headers=headers, stream=True, verify=False) as response:\n        # check if the request was successful\n        response.raise_for_status()\n\n        if response.headers['content-type'] != 'application/pdf':\n            raise Exception('The response is not a pdf')\n\n        with open(path, 'wb') as f:\n            # write the response to the file, chunk_size bytes at a time\n            for chunk in response.iter_content(chunk_size=8192):\n                f.write(chunk)\n\n\ndef download_paper(session: Session, paper_id: str, directory: str = 'papers', user_agent: str = 'requests/2.0.0') -> Union[str, None]:\n    try:\n        directory = os.environ['workspace']\n    except:\n        pass\n    \n    paper = get_paper(session, paper_id, fields='paperId,title,isOpenAccess,openAccessPdf')\n\n    # check if the paper is open access\n    if not paper['isOpenAccess']:\n        return None\n\n    paperId: str =re.sub(r'\\W+', '', 
paper['title']).encode(\"utf-8\").decode(\"utf-8\")\n    pdf_url: str = paper['openAccessPdf']['url']\n    pdf_path = os.path.join(directory, f'{paperId}.pdf')\n    if os.path.isfile(pdf_path):\n        return None\n\n    # create the directory if it doesn't exist\n    os.makedirs(directory, exist_ok=True)\n\n    # check if the pdf has already been downloaded\n    if not os.path.exists(pdf_path):\n        download_pdf(session, pdf_url, pdf_path, user_agent=user_agent)\n\n    return pdf_path\n\n\ndef download_papers(paper_ids: list[str], directory: str = 'papers', user_agent: str = 'requests/2.0.0') -> Generator[tuple[str, Union[str, None, Exception]], None, None]:\n    # use a session to reuse the same TCP connection\n    with Session() as session:\n        for paper_id in paper_ids:\n            try:\n                yield paper_id, download_paper(session, paper_id, directory=directory, user_agent=user_agent)\n            except Exception as e:\n                yield paper_id, e\n\n\ndef main(args: argparse.Namespace) -> None:\n    for paper_id, result in download_papers(args.paper_ids, directory=args.directory, user_agent=args.user_agent):\n        if isinstance(result, Exception):\n            return f\"Failed to download '{paper_id}': {type(result).__name__}: {result}\"\n        elif result is None:\n            return f\"'{paper_id}' is not open access\"\n        else:\n            return f\"Downloaded '{paper_id}' to '{result}'\"\n\n\nif __name__ == '__main__':\n    parser = argparse.ArgumentParser()\n    parser.add_argument('--directory', '-d', default='papers')\n    parser.add_argument('--user-agent', '-u', default='requests/2.0.0')\n    parser.add_argument('paper_ids', nargs='+', default=[])\n    args = parser.parse_args()\n    main(args)\n\ndef Main(paper_ids=[], dir='papers', user_agent = 'requests/2.0.0', ):\n    outcome = ''\n    if isinstance(paper_ids, str):\n        if ',' in paper_ids: \n            paper_ids = paper_ids.split(',')\n        else:\n 
           paper_ids = paper_ids.split()\n            \n        paper_ids = (id.strip() for id in paper_ids)\n    for paper_id, result in download_papers(paper_ids, directory=dir, user_agent=user_agent):\n        if isinstance(result, Exception):\n            outcome += f\"Failed to download '{paper_id}': {type(result).__name__}: {result}\\n\"\n        elif result is None:\n            outcome += f\"couldn't download '{paper_id}' because it is not open access\\n\"\n        else:\n            outcome += f\"{result}\\n\"\n    return outcome"
  },
  {
    "path": "Assistant/tools.py",
    "content": "# imports for Local Search Engine\nimport openai\nimport os\nimport pandas as pd\nimport numpy as np\nfrom openai.embeddings_utils import distances_from_embeddings, cosine_similarity\nfrom tqdm import tqdm\nimport ast\n\nfrom . import webui\n\n# import for Translator\nimport regex as re\nimport langid\nfrom textblob import TextBlob\ntry: import translators as ts\nexcept: print('could not import translators package')\nimport argostranslate.package\nimport argostranslate.translate\n\nimport math\nimport time \nimport collections\n\n\"\"\"\nAssistantChat: dictionary on steroids.\n\"\"\"\nclass AssistantChat(collections.MutableSequence):\n    def __init__(self, begin:list, *args):\n        self.body = begin\n        self.filename= None\n        self.extend(list(args))\n    \n    def is_saved(self):\n        return True if self.filename != None else False\n    \n    def insert(self, i, v):\n        self.body.insert(i, v)\n\n    def append(self, item):\n        self.body.append(item)\n\n    def __call__(self):\n        return self.body\n\n    def __len__(self): return len(self.body)\n\n    def __getitem__(self, i): return self.body[i]\n\n    def __delitem__(self, i): del self.body[i]\n\n    def __setitem__(self, i, v):\n        self.body[i] = v\n\n    def __str__(self):\n        return str(self.body)\n\n\n\"\"\"\nTranslator: \nperforms basic translation opration using ChatGPT. 
\nSetting temperature to 0 allows better raw results\n\"\"\"\n\n\"\"\"\noptions:\n - gpt-3.5-turbo: reasonably fast, online, requires openai credit usage \n - translators 5.6.3 lib: online, excellent, long lags might occur\n - [default] argostranslator: fast,  offline \n\"\"\"\n\nclass Translator:\n    def __init__(self, model=\"argostranslator\", **kwargs):\n        POSSIBLE_MODELS = ['argostranslator','gpt-3.5-turbo', 'translators']\n        if model not in POSSIBLE_MODELS:\n            raise Exception('this Translation model is not available')\n        \n        self.DEFAULT_CHAT = [{\"role\": \"system\", \n                    \"content\": \"You are a translator. You receive text and target language as inputs and translate the text to the target language\"}]\n        self.body = None\n        self.model = model\n        langs = kwargs['translator_languages']\n        self.languages = langs\n\n        # Download and install Argos Translate packages\n        argostranslate.package.update_package_index()\n        available_packages = argostranslate.package.get_available_packages()\n        \n        langid.set_languages(langs)\n\n        for i in range(len(langs)):\n            for j in range(len(langs)):\n                if langs[i]==langs[j]: continue\n                \n                try:\n                    package_to_install = next(\n                        filter(\n                            lambda x: x.from_code == langs[i] and x.to_code == langs[j], available_packages\n                        )\n                    )\n                except:\n                    print(f'failed to add {langs[i]} => {langs[j]}')\n                    # no package for this language pair: skip the install step\n                    continue\n                print(f'downloading Argos Translate Language packages...')\n                try:\n                    argostranslate.package.install_from_path(package_to_install.download())\n                except:\n                    pass\n        \n\n    def translate(self, input, to_language, from_language=None):\n        if from_language == 
to_language: return input\n\n        if from_language is None:\n            from_language = langid.classify(input)[0]\n            \n        if self.model==\"gpt-3.5-turbo\":\n            # copy the default chat so repeated calls do not accumulate messages\n            self.body = list(self.DEFAULT_CHAT)\n            self.body.append({\"role\":\"user\", \"content\":f\"translate in {to_language}:'{input}'\"})\n            try:\n                API_response = openai.ChatCompletion.create(\n                        model=self.model,\n                        temperature=0,  \n                        messages=self.body)\n                \n            except Exception as e:\n                print(f\"couldn't translate {self.body[-1]}\")\n                print(e)\n                return input\n            return API_response['choices'][0]['message']['content']\n        \n        if self.model=='translators':\n            try:\n                res = ts.translate_text(input, translator='google', to_language=to_language, from_language=from_language)\n            except:\n                res = input\n                self.model = 'argostranslator'\n                print('translation using translators failed, switching to argostranslate')\n                \n            return res\n        \n        if self.model == 'argostranslator':\n            try:\n                res = argostranslate.translate.translate(input, from_code=from_language, to_code=to_language)\n            except:\n                print(f\"translation using argostranslate from: {from_language} - to -> {to_language} Failed\")\n                print(input)\n                res= input\n            return res\n\n    \n\n\"\"\"\nLocalSearchEngine:\n - Looks for files in a folder;\n - extracts information;\n - creates high-value content that allows for accurate search;\n\nto be implemented:\n - extend research to .pdf and .jpeg (w/ ChatGPT4)\n    - extends also to videos;\n    - extends also to scientific papers;\n\"\"\"\n\nclass LocalSearchEngine:\n    def __init__(self, \n                 embed_model = 
\"text-embedding-ada-002\", \n                 tldr_model = \"gpt-3.5-turbo\",\n                 translator_model = \"argostranslator\",\n                 translator_languages = ['en','it','es'],\n                 default_dir = os.path.realpath(os.path.join(os.getcwd(),'saved_chats')),\n                 irrelevancy_th=0.8):\n        \n        self.translate_engine = Translator(model=translator_model, translator_languages=translator_languages)\n        self.tldr_model  = tldr_model\n        self.embed_model = embed_model\n        self.default_dir = default_dir\n        self.irrelevancy_threshold = irrelevancy_th\n    \n    def compute_similarity(self, key, text):\n        if type(key)==str:  key_embedding = self.compute_embeds(key)\n        else: key_embedding = key\n        \n        if type(text)==str: query_embedding =self.compute_embeds(text)\n        else: query_embedding = text\n\n        similarity = cosine_similarity(key_embedding, query_embedding)\n        return similarity\n    \n\n    def accurate_search(self, key, path=None, n=-1, from_csv=False):\n        if path is None:\n            path = self.default_dir\n\n        print('\\n')\n        if 'DATAFRAME.csv' not in os.listdir(path):\n            print('> > DATAFRAME.csv not detected building a new one')\n            pd.DataFrame({'file_names':['DATAFRAME.csv'], 'similarity':[0],\"tags\":[None]}).to_csv(os.path.join(path, 'DATAFRAME.csv'))\n\n        if isinstance(key, list) or isinstance(key, tuple):\n            key = \" \".join(key)\n        \n        # USE EXISTING DATAFRAME TO MAKE SEARCH FASTER (skip tag generation)\n        if from_csv:\n            DataFrame = pd.read_csv(os.path.join(path,'DATAFRAME.csv'))\n            fnames = DataFrame[\"file_names\"]\n            tags = DataFrame[\"tags\"]\n            embeds = DataFrame[\"embeddings\"]\n\n            if len(fnames)!=len(os.listdir(path)):\n                print('> dataset not updated. 
Updating it now...')\n                \n                self.produce_folder_tags() ### I should add a parameter to specify HugginFaceHub (free) embeddings or OpenAI ones ($)\n        \n        print('> Analyzing DataFrame:')\n\n        results = []\n        topics = []\n        \n        key_embed = {}\n        for lang in self.translate_engine.languages:\n            transl_key = self.translate_engine.translate(input=key, to_language=langid.classify(lang)[0], from_language=langid.classify(key)[0])\n            print(f'> > computing key embedding in {lang} language')\n            key_embed[lang]= self.compute_embeds(transl_key)\n\n        for i in tqdm(range(len(fnames))):\n            if not(fnames[i].endswith('.txt')):\n                results.append(0)\n                topics.append('None')\n                continue\n            \n            # extract tags associated to the file\n            file_tags = tags[i]\n            topics.append(file_tags)\n\n            # extract and parse the saved embeddings\n            file_embeds = ast.literal_eval( embeds[i] ) # from \"[a, b, c,]\" to [a, b, c]\n\n            # take the key embedding from the same language (more accurate) \n            key_embedding = key_embed[langid.classify(file_tags)[0]]\n            \n            done=False\n            while not(done):\n                try:\n                    relevance = self.compute_similarity(file_embeds, key_embedding)\n                    done=True\n\n                except Exception as e:\n                    print(e)\n\n            results.append(relevance)\n\n        if n==-1: n=len(fnames)\n        df = pd.DataFrame({'file_names':fnames, 'similarity':results,\"tags\":topics})\n        df = df.sort_values(by='similarity', ascending=False)\n        df = df.reset_index(drop=True)\n\n        return df.head(n)\n\n\n    def produce_folder_tags(self, path=None):\n        if path is None:\n            path = self.default_dir\n        \n        if ('DATAFRAME.csv' in 
os.listdir(path)):\n            print('> > DataFrame existing')\n        else:\n            print('> > Creating empty DataFrame')\n            pd.DataFrame(columns=['file_names', 'tags', 'embeddings']).to_csv(os.path.join(path,'DATAFRAME.csv'))\n\n        existing_df = pd.read_csv(os.path.join(path, 'DATAFRAME.csv'))\n\n        fnames = os.listdir(path)\n        embeds = []\n        topics = []\n        n_updates = 0\n\n        for filename in fnames:\n            # process text files only\n            if not(filename.endswith('.txt')):\n                embeds.append(math.nan)\n                topics.append('NaN')\n                continue\n\n            # don't repeat calculation if the file has already been processed \n            has_tags = len(existing_df['tags'][existing_df[\"file_names\"]==filename])>=1\n            \n            try:\n                has_embeds = len(existing_df['embeddings'][existing_df[\"file_names\"]==filename].to_list()[0]) >5 \n            except:\n                has_embeds = False\n        \n            f = open(os.path.join(path,filename), 'r')\n            text = f.read()\n            if count_tokens(text)>4096:\n                # keep 2000 words only\n                text = \" \".join(text.split()[0:2000])\n\n            if has_tags:\n                tags= existing_df['tags'][existing_df[\"file_names\"]==filename].to_list()[0]\n                topics.append(tags)\n            else:\n                n_updates +=1\n                print(f'> > {filename}: extracting topics')\n                done= False\n                while not(done):\n                    try:    \n                        tags = self.extract_tags(text)\n                        done= True\n                    except:\n                        print('> > system overloaded, waiting 5 sec')\n                        time.sleep(5)\n\n                topics.append(tags)\n\n            if has_embeds:\n                
embeds.append(existing_df['embeddings'][existing_df[\"file_names\"]==filename].to_list()[0])\n            else:\n                n_updates +=1\n                print(f'> > {filename}: processing embeddings')\n                done = False\n                while not(done):\n                    try:            \n                        embedding = self.compute_embeds(tags)\n                        done= True\n                    except:\n                        print('> > system overloaded, waiting 5 sec')\n                        time.sleep(5)\n                embeds.append(embedding)\n\n        df = pd.DataFrame({'file_names':fnames, 'tags':topics, 'embeddings':embeds})\n        df.to_csv(os.path.join(path,'DATAFRAME.csv'), index=False)\n        df = df.reset_index(drop=True)\n        print(f\"> > # UPDATES applied:{n_updates}\")\n        return df\n\n    def extract_tags(self, text):\n        text = text.split('user:')\n        text = \"\".join(text[1:])\n\n        chat = [{\"role\": \"system\", \n                    \"content\": \"You recieve text and extract up to 10 different topic covered in the text. 
You output the topics separated by a comma (,)\"}]\n        chat.append({\"role\": \"user\", \"content\":f\"extract tags:{text}\"})\n        API_response = openai.ChatCompletion.create(\n                model=self.tldr_model,\n                temperature=0,\n                messages=chat)\n        \n        output = API_response['choices'][0]['message']['content']\n        if ':' in output:\n            output = output.split(':')\n            output = \"\".join(output[1:])\n        return output\n\n    # ADD Free alternative (Huggingface Embeds)\n    def compute_embeds(self, words):\n        return openai.Embedding.create(input=words, engine=self.embed_model)['data'][0]['embedding']\n\n    def DaVinci_tldr(self, text):\n        response = openai.Completion.create(\n            model=\"text-davinci-003\",\n            prompt=f\"{text}\\n\\nTl;dr\",\n            temperature=0,\n            max_tokens=200,\n            top_p=1.0,\n            frequency_penalty=0.0,\n            presence_penalty=0.0\n        )\n        return response['choices'][0][\"text\"]\n    \n    def tldr(self, text, to_language=None, with_model = ''):\n        if self.tldr_model == 'gpt-3.5-turbo'or with_model=='gpt-3.5-turbo':\n            text = text.replace('\\n',' ')\n            if to_language != None:\n                context =f'tldr in {to_language}:'\n                CHAT = [{\"role\": \"system\", \"content\":context},\n                        {\"role\": \"user\", \"content\":f\"'{text}'\"}]\n                \n                response = openai.ChatCompletion.create(\n                            model=\"gpt-3.5-turbo\",\n                            temperature=0,\n                            max_tokens=200,\n                            messages=CHAT)\n                \n                try:\n                    return response['choices'][0]['message']['content']\n                except:\n                    pass\n\n            else: \n                return self.DaVinci_tldr(text)\n        
    \n        \n        if self.tldr_model == 'Vicuna' or with_model=='Vicuna':\n            try:\n                webui.set_text_gen_params(temperature=0.1)\n                result = webui.oobabooga_textgen(prompt=f'Text Summarizer [Question]: summarize the following text: {text}\\n[Answer]:')\n                postprocessed = webui.post_process(result)\n                return postprocessed\n            except IndexError as e:\n                return result\n            except Exception as e:\n                print(e)\n                return ''\n    \n\"\"\"\nOnlineSearchEngine:\nto be implemented:\n - allows to extract content from the internet with http requests;\n - provide context to the VirtualAssistant\n - find a way to trigger online search\n\"\"\"\n\nclass OnlineSearchEngine:\n    # work in progress\n    pass\n\n\"\"\"\nMISCELLANEOUS FUNCTIONS\n\"\"\"\ndef count_tokens(vCountTokenStr):\n    # Tokenize the input string\n    blob = TextBlob(vCountTokenStr)\n    tokens = blob.words\n\n    # Count the number of tokens\n    num_tokens = len(tokens)\n    return num_tokens\n\n\ndef parse_conversation(string_chat):\n    split1_chat = string_chat.split('user:')\n\n    rebuilt = []\n\n    for item in split1_chat:\n        if 'system:' in item:\n            rebuilt.append({\"role\":\"system\", \"content\":f\"{item.split('system:')[-1]}\"})\n        if 'assistant:' in item:\n            spl_item = item.split(\"assistant:\")\n            rebuilt.append({\"role\":\"user\", \"content\":f\"{spl_item.pop(0)}\"})\n            \n            while len(spl_item)>=1:\n                rebuilt.append({\"role\":\"assistant\", \"content\":f\"{spl_item.pop(0)}\"})\n    \n    return rebuilt\n\ndef take_last_k_interactions(chat, max_tokens=4000):\n    n_tokens = 0\n    interactions = []\n\n    for item in chat:\n        n_tokens += count_tokens(item['content'])\n        if n_tokens>= max_tokens:\n            return interactions\n        interactions.append(item)\n\n    # the whole chat fits within max_tokens\n    return interactions"
  },
  {
    "path": "Assistant/voice.py",
    "content": "# imports\nimport pyttsx3\nfrom ibm_watson import TextToSpeechV1\nfrom ibm_cloud_sdk_core.authenticators import IAMAuthenticator\nfrom TTS.api import TTS\nimport os\nimport elevenlabslib\nfrom contextlib import contextmanager\nimport pygame\nfrom pydub import AudioSegment\nimport io\nimport sys\nimport langid\n\nclass Voice:\n    def __init__(self, languages, **kwargs):   \n        # IBM CLOUD\n        try:\n            print('Authorizing IBM Cloud:')\n            url = kwargs['ibm_url']\n            apikey = kwargs['ibm_api']\n            # Setup Service\n            print('  1/3: Setting up cloud authenticator...')\n            authenticator = IAMAuthenticator(apikey)\n            # New tts service\n            print('  2/3: Setting up text-to-speech...')\n            tts = TextToSpeechV1(authenticator=authenticator)\n            # set serive url\n            print('  3/3: Setting up cloud service ...')\n            tts.set_service_url(url)\n            print('    ✓ service established\\n')\n            self.tts_service = tts\n        except:\n            print('IBM authentication failed')\n\n        if 'elevenlabs_api' in kwargs:\n            try:\n                eleven_labs_user = elevenlabslib.ElevenLabsUser(kwargs['elevenlabs_api'])\n                \n                if 'elevenlabs_voice' in list(kwargs.keys()):\n                    if kwargs['elevenlabs_voice'] in (voice.initialName for voice in eleven_labs_user.get_all_voices()):             \n                        self.elevenlabs_voice = eleven_labs_user.get_voices_by_name(kwargs['elevenlabs_voice'])[0]\n\n            except:\n                print('Couldn t connect with Elevenlabs')\n            # <to do: initiate Jarvis cloned voice if available and disable TTS>\n\n        # PYTTSX3 for backup plan\n        engine = pyttsx3.init()\n\n        # SYNTHETIC VOICES\n        # CoquiAI -  coqui-ai/TTS (https://github.com/coqui-ai/tts)\n        synth = 
TTS(model_name=os.path.join(\"tts_models/multilingual/multi-dataset/your_tts\"), progress_bar=False, gpu=True)  \n        \n        self.languages = languages\n        self.write_dir = kwargs['write_dir']\n        self.path = kwargs['voice_id']\n        print('cloning voice from:',self.path)\n        self.synthetic_voice = synth\n        self.offline = engine\n\n\n\n\n    def speak(self, text, VoiceIdx, mode, elevenlabs=False, IBM=False):\n        ## delete old last_answer.wav to avoid conflicts\n        last_answer = os.path.join(self.write_dir, \"last_answer.wav\")\n        if os.path.exists(last_answer): os.remove(last_answer)\n\n        ## generate the speech: last_answer.wav\n        if mode == 'online':\n            if  elevenlabs==True:\n                if  VoiceIdx == 'en':\n                    try:\n                        audio = self.elevenlabs_voice.generate_audio_bytes(text)\n                        audio = AudioSegment.from_file(io.BytesIO(audio), format=\"mp3\")\n                        audio.export(os.path.join(self.write_dir, \"last_answer.wav\"), format=\"wav\")\n                    except Exception as e:\n                        print(f'Elevenlabs credit might have ended. 
{e}')\n                        raise Exception\n\n                if VoiceIdx == 'jarvis':\n                    # to do: use voice duplication from elevenlabs\n                    print('(ElevenLabs Jarvis voice not yet available)')\n                    raise Exception()\n\n            elif IBM==True:\n                with open(os.path.join(self.write_dir, \"last_answer.wav\"),'wb') as audio_file:\n                    try:\n                        if VoiceIdx=='jarvis':VoiceIdx='en'\n                        res = self.tts_service.synthesize(text, accept='audio/wav', voice=get_ibm_voice_id(VoiceIdx)).get_result()\n                        audio_file.write(res.content)\n                    except:\n                        print('(IBM credit might have ended)')\n                        raise Exception\n\n        if mode == 'offline': \n            if VoiceIdx == 'jarvis' and langid.classify(text)[0]=='en':\n                LangIdx = 'en'\n                print(self.path, LangIdx)\n                self.synthetic_voice.tts_to_file(text=text, speaker_wav=self.path[LangIdx], language=LangIdx, file_path=os.path.join(self.write_dir, 'last_answer.wav'))\n                \n                \n                \"\"\" Idea for multiple language Text-To-Speech: dictionaries\n                if VoiceIdx == 'other-language':\n                    self.synthetic_voice['other-language'].tts_to_file(text=text, speaker_wav=self.path, language=\"en\", file_path=os.path.join(self.DIRECTORIES['SOUND_DIR'], 'last_answer.wav'))\n                \"\"\"\n            else:\n                LangIdx = langid.classify(text)[0]\n                self.offline = self.change_offline_lang(lang_id=LangIdx)\n                self.offline.say(text)\n                self.offline.runAndWait()\n                return\n        \n        # play the generated speech:\n        if pygame.mixer.get_init() is None:pygame.mixer.init()\n        pygame.mixer.music.load(os.path.join(self.write_dir, 'last_answer.wav'))\n     
   pygame.mixer.music.set_volume(0.5)\n        pygame.mixer.music.play()\n        while(pygame.mixer.music.get_busy()): pass\n        return\n\n\n\n    def change_offline_lang(self, lang_id):\n        engine = pyttsx3.init()\n        try:\n            for voice in self.offline.getProperty('voices'):\n                if self.languages[lang_id] in voice.name:\n                    engine.setProperty('voice', voice.id)\n                    return engine\n            return engine\n        except Exception as e:    \n            print('error while switching to lang: ',lang_id,e)\n            return engine \n\n# know more at: https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices\ndef get_ibm_voice_id(VoiceIdx):\n    voices={\n        'ar':'ar-MS_OmarVoice',\n\n        'zh':'zh-CN_LiNaVoice',\n        'zh':'zh-CN_WangWeiVoice',\n        'zh':'zh-CN_ZhangJingVoice',\n\n        'cz':'cs-CZ_AlenaVoice',\n\n        'nl':'nl-BE_AdeleVoice',\n        'nl':'nl-BE_BramVoice',\n        'nl':'nl-NL_EmmaVoice',\n        'nl':'nl-NL_LiamVoice',\n        'nl':'nl-NL_MerelV3Voice',\n\n        'en':'en-GB_CharlotteV3Voice',\n        'en':'en-GB_JamesV3Voice',\n        'en':'en-GB_KateV3Voice',\n        'en':'en-US_AllisonV3Voice',\n        'en':'en-US_EmilyV3Voice',\n        'en':'en-US_HenryV3Voice',\n        'en':'en-US_KevinV3Voice',\n        'en':'en-US_LisaV3Voice',\n        'en':'en-US_MichaelV3Voice',\n        'en':'en-US_OliviaV3Voice',\n\n        'fr':'fr-CA_LouiseV3Voice',\n        'fr':'fr-FR_NicolasV3Voice',\n        'fr':'fr-FR_ReneeV3Voice',\n\n        'de':'de-DE_BirgitV3Voice',\n        'de':'de-DE_DieterV3Voice',\n        'de':'de-DE_ErikaV3Voice',\n\n        'it':'it-IT_FrancescaV3Voice',\n        'ja':'ja-JP_EmiV3Voice',\n        \n        'ko':'ko-KR_HyunjunVoice',\n        'ko':'ko-KR_SiWooVoice',\n        'ko':'ko-KR_YoungmiVoice',\n        'ko':'ko-KR_YunaVoice',\n        'ko':'ko-KR_JinV3Voice',\n\n        'pt':'pt-BR_IsabelaV3Voice',\n\n        
'es':'es-ES_EnriqueV3Voice',\n        'es':'es-ES_LauraV3Voice',\n        'es':'es-LA_SofiaV3Voice',\n        'es':'es-US_SofiaV3Voice',\n\n        'sv':'sv-SE_IngridVoice'\n        }\n    return voices[VoiceIdx]\n\n\n@contextmanager\ndef suppress_stdout():\n    with open(os.devnull, \"w\") as devnull:\n        old_stdout = sys.stdout\n        sys.stdout = devnull\n        try:  \n            yield\n        finally:\n            sys.stdout = old_stdout"
  },
  {
    "path": "Assistant/webui.py",
    "content": "import json \nimport requests\nimport re\nimport langid\n\nSERVER = 'localhost'\nTEXT_GEN_PARAMS = {\n    'max_new_tokens': 200,\n    'do_sample': True,\n    'temperature': 0.72,\n    'top_p': 0.73,\n    'typical_p': 1,\n    'repetition_penalty': 1.1,\n    'encoder_repetition_penalty': 1.0,\n    'top_k': 0,\n    'min_length': 0,\n    'no_repeat_ngram_size': 0,\n    'num_beams': 1,\n    'penalty_alpha': 0,\n    'length_penalty': 1,\n    'early_stopping': False,\n    'seed': -1,\n    'add_bos_token': True,\n    'custom_stopping_strings': [],\n    'truncation_length': 2048,\n    'ban_eos_token': False,\n}\n\ndef set_text_gen_params(**kwargs):\n    for key in kwargs:\n        if key not in list(TEXT_GEN_PARAMS.keys()): raise Exception('no such parameter in oogabooga text generation')\n        TEXT_GEN_PARAMS[key]=kwargs[key]\n\n\ndef oobabooga_textgen(prompt, params=TEXT_GEN_PARAMS, server=SERVER):\n    ChatMode = True if type(prompt) == list else False\n    \n    if ChatMode: \n        nMessages = len(prompt)\n        prompt = parse_conversation(prompt) \n\n    payload = json.dumps([prompt, params])\n    APIresponse = requests.post(f\"http://{server}:7860/run/textgen\", json={\n        \"data\": [\n            payload\n        ]\n    }).json()\n    reply = APIresponse[\"data\"][0]\n    \n    # hallucination filter:\n    if ChatMode:\n        reply = reply.replace(\"[assistant]:\",\"###\")\n        reply = reply.replace(\"[user]:\",\"###\")\n        reply = reply.replace(\"[system]:\",\"###\")\n        reply = reply.split('###')\n        reply = \" \".join(reply[(nMessages+1):(nMessages+2)])\n    \n    return reply\n    \ndef post_process(answer):\n    allowed = ['Answer','Outcome','Discussion','Conclusion']\n    answer = answer.split('[Question]')[-1]\n\n    relevant =''\n    for a in allowed:\n        if a in answer:\n            temp = re.split(r'\\[|\\]', answer)\n            try:\n                relevant += temp[temp.index(a)+1].strip(':')\n       
     except:\n                print('Failure processing answer')\n                pass\n        \n    print(len(relevant.split()))\n    return relevant\n    \ndef parse_conversation(chat):\n    linkDetectionRegexStr = \"[a-zA-Z0-9]((?i) dot |(?i) dotcom|(?i)dotcom|(?i)dotcom |\\.|\\. | \\.| \\. |\\,)[a-zA-Z]*((?i) slash |(?i) slash|(?i)slash |(?i)slash|\\/|\\/ | \\/| \\/ ).+[a-zA-Z0-9]\"\n    oobaboogaChatHistory = \"\"\n    for message in chat:\n        oobaboogaChatHistory += f\"[{str(message['role'])}]:{message['content']}\\n\"\n    oobaboogaChatHistory = re.sub(linkDetectionRegexStr, \"<url>\", oobaboogaChatHistory)\n    return oobaboogaChatHistory"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2023 Gianmarco Guarnier\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# JARVIS-ChatGPT: A conversational assistant equipped with J.A.R.V.I.S's voice\r\n**A voice-based interactive assistant equipped with a variety of synthetic voices (including J.A.R.V.I.S's voice from IronMan)**\r\n\r\n![GitHub last commit](https://img.shields.io/github/last-commit/gianmarcoguarnier/JARVIS-ChatGPT?style=for-the-badge)\r\n<p align=\"center\">\r\n  <img src=\"https://user-images.githubusercontent.com/49094051/227788148-a8ff8e06-86a4-41a6-aa53-8b7d6855360c.png\"/>\r\n  <span style=color:grey> <i>image by MidJourney AI </i> </span>\r\n</p>\r\n\r\nEver dreamed to ask hyper-intelligent system tips to improve your armor? Now you can! Well, maybe not the armor part... This project exploits OpenAI Whisper, OpenAI ChatGPT and IBM Watson.\r\n<p align=\"center\"> <strong> PROJECT MOTIVATION:  </strong> </p> \r\n\r\n*Many times ideas come in the worst moment and they fade away before you have the time to explore them better. The objective of this project is to develop a system capable of giving tips and opinions in quasi-real-time about anything you ask. The ultimate assistant will be able to be accessed from any authorized microphone inside your house or your phone, it should run constantly in the background and when summoned should be able to generate meaningful answers (with a badass voice) as well as interface with the pc or a server and save/read/write files that can be accessed later. It should be able to run research, gather material from the internet (extract content from HTML pages, transcribe Youtube videos, find scientific papers...) and provide summaries that can be used as context to make informed decisions. 
In addition, it might interface with some external gadgets (IoT) but that's extra.*\r\n<br>\r\n<br>\r\n<br>\r\n\r\n<p align=\"center\"> <strong> DEMO: </strong> </p> \r\n\r\nhttps://user-images.githubusercontent.com/49094051/231303323-9859e028-33e1-490d-9967-44852fd0efc5.mp4\r\n\r\n<br>\r\n\r\n---\r\n## JULY 14th 2023 UPDATE: Research Mode\r\nI can finally share the first draft of the Research Mode. This mode is designed for people who often deal with research papers. \r\n- Switch to research mode by saying *'Switch to Research Mode'*\r\n- :star: Initialize a new workspace like this: *'Initialize a new workspace about Carbon Fiber Applications in the Spacecraft industry'*. A workspace is a folder that collects and organizes the results of the research. This protocol is subdivided into 3 sub-routines:\r\n   1. Core Paper identification: use the **Semantic Scholar API** to identify some strongly relevant papers;\r\n   2. Core Expansion: for each paper, find some suggestions, then keep only the suggestions that appear to be similar to at least 2 papers;\r\n   3. Refy Expansion: use the refy suggestion package to enlarge the results;\r\n- Find suggestions like: *'find suggestions that are similar to the paper with title ...'*\r\n- Download: *'download the paper with title ...'*\r\n- :star: Query your database like: *'what is the author of the paper with title ...?'*  *'what are the experimental conditions set for the paper with title ...?'*\r\n\r\nPS: This mode is not super stable and needs to be worked on<br>\r\n\r\n*PPS: This project will be paused for some time since I'll be working on my thesis until 2024. 
However there are already so many things that can be improved so I'll be back!*\r\n## What you'll need:\r\n<p align=\"center\"><i>DISCLAIMER:<br> The project might consume your OpenAI credit resulting in undesired billing;<br> I don't take responsibility for any unwanted charges;<br>Consider setting limitations on credit consumption at your OpenAI account; </i> </p> \r\n\r\n - An [OpenAI](https://openai.com) account and API key; (check FAQs below for the alternatives)\r\n - <i>[PicoVoice](https://picovoice.ai/platform/porcupine/) account and a free AccessKey; (optional) </i>\r\n - <i>[ElevenLabs](https://beta.elevenlabs.io/) account and free Api Key (optional)</i>;\r\n - [langChain API keys](https://github.com/hwchase17/langchain/blob/master/docs/modules/agents/tools/getting_started.md) for web surfing (news, weather, serpapi, google-serp, google-search... they are all free)\r\n - [ffmpeg](https://ffmpeg.org/) ;\r\n - Python virtual environment (Python>=3.9 and <3.10);\r\n - <i> Some credit to spend on ChatGPT (you can get three months of free usage by signing up to OpenAI) (suggested)</i>;\r\n - CUDA version >= 11.2;\r\n - <i> An IBM Cloud account to exploit their cloud-based text-to-speech models ([tutorial](https://www.youtube.com/watch?v=A9_0OgW1LZU))(optional)</i>;\r\n - A (reasonably) fast internet connection (most of the code relies on API so a slower connection might result in a longer time to respond);\r\n - mic and speaker;\r\n - CUDA capable graphic engine (my Torch Version: 2.0 and CUDA v11.7 ```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117```);\r\n - Patience :sweat_smile:\r\n\r\n> you can rely on the new ```setup.bat``` that will do most of the things for you.\r\n\r\n\r\n## GitHub overview\r\n**MAIN** script you should run: `openai_api_chatbot.py` if you want to use the latest version of the OpenAI API Inside the demos folder you'll find some guidance for the packages used in the project, if you have 
errors, check these files first to pinpoint the problem. Most of the code is stored in the Assistant folder: `get_audio.py` stores all the functions to handle mic interactions, `tools.py` implements some basic aspects of the Virtual Assistant, `voice.py` describes a (very) rough Voice class. ```Agents.py``` handles the LangChain part of the system (here you can add or remove tools from the toolkits of the agents)<br> The remaining scripts are supplementary to the voice generation and should not be edited. \r\n\r\n# INSTALLATION TUTORIAL\r\n## Automatic installation\r\nYou can run ```setup.bat``` if you are running on Windows/Linux. The script will perform every step of the manual installation in sequence. Refer to the manual steps if the procedure fails.<br>\r\nThe automatic installation will also run the Vicuna installation ([Vicuna Installation Guide](https://hub.tcno.co/ai/text-ai/vicuna/))\r\n## Manual Installation\r\n## Step 1: installation, accounts, APIs... \r\n### Environment\r\n1. Make a new, empty virtual environment with Python 3.8 and activate it (.\\venv_name\\Scripts\\activate );\r\n2. ```pip install -r venv_requirements.txt```; This might take some time; if you encounter conflicts on specific packages, install them manually without the ```==<version>```;\r\n3. Install PyTorch manually according to your CUDA version;\r\n4. Copy and paste the files you'll find in the folder ```whisper_edits``` to the ```whisper``` folder of your environment (.\\venv\\lib\\site-packages\\whisper\\ ) <span style=\"color:grey\"> these edits just add an attribute to the whisper model to access its dimension more easily; </span> \r\n5. Install [TTS](https://github.com/coqui-ai/tts);\r\n6. Run [their script](https://github.com/coqui-ai/TTS/blob/dev/README.md#-python-api) and check everything is working (it should download some models) (you can alternatively run ```demos/tts_demo.py```);\r\n7. 
Rename or delete the TTS folder and download the Assistant and other scripts from this repo \r\n8. Install Vicuna following the instructions in the Vicuna folder or by running:<br><p align='center'>\r\n```cd Vicuna```<br>\r\n```call vicuna.ps1```<br></p>\r\n<span style=\"color:grey\"> The manual instructions will direct you to the [Vicuna Installation Guide](https://hub.tcno.co/ai/text-ai/vicuna/) </span> \r\n9. Paste all your keys in the ```env.txt``` file and rename it to ```.env``` (yes, remove the txt extension)\r\n10. Check everything works *(see the checks below)*\r\n<br>\r\n\r\n### Checks\r\n- Verify your graphic engine and CUDA version are compatible with PyTorch by running `torch.cuda.is_available()` and `torch.cuda.get_device_name(0)` inside Python;\r\n- Run ```tests.py```. This file attempts to perform basic operations that might raise errors;\r\n- [WARNING] Check the FAQs below if you have errors;\r\n- You can check the sources of error by running demos in the demos folder;\r\n\r\n\r\n## Step 2: Language support\r\n- To have answers spoken in your language you should first check if your language is supported by the speech generator at __https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices__; \r\n- If it's supported, add or change the languages inside ```VirtualAssistant.__init__()``` ;<br>\r\n\r\n<p align=\"center\">\r\n  <img src=\"https://user-images.githubusercontent.com/49094051/230505516-4dba0f29-f45a-4311-aa54-1d93fca25de5.PNG\"/>\r\n</p>\r\n\r\n- Remember: The loaded Whisper is the medium one. If it performs badly in your language, upgrade to the larger one in the ```__main__()``` at `whisper_model = whisper.load_model(\"large\")`; just make sure your GPU has enough memory for it.\r\n\r\n## Step 3: Running (`openai_api_chatbot.py`):\r\nWhen running, you'll see a lot of information being displayed. 
I'm constantly striving to improve the readability of the execution, the whole project is a huge beta, forgive slight variations from the screens below. Anyway, this is what happens in general terms when you hit 'run':\r\n- Preliminary initializations take place, you should hear a chime when the Assistant is ready;\r\n- When *awaiting for triggering words* is displayed you'll need to say `Jarvis` to summon the assistant. At this point, a conversation will begin and you can speak in whatever language you want (if you followed step 2). The conversation will terminate when you 1) say a [stop word](https://github.com/gianmarcoguarnier/JARVIS-ChatGPT/tree/main#key-words) 2) say something with one word (like 'ok') 3) when you stop making questions for more than 30 seconds <br>\r\n<p align=\"center\">\r\n  <img src=\"https://user-images.githubusercontent.com/49094051/230505896-c8a2ff80-4265-41e4-a6d5-e9f56d156afa.PNG\" /><br>\r\n  <img src=\"https://user-images.githubusercontent.com/49094051/230506756-287a1d6b-9652-4c66-bea8-cd75380ab45b.PNG\" /><br>\r\n</p>\r\n\r\n- After the magic word is said, the word *listening...* should then appear. At this point, you can make your question. When you are done just wait (3 seconds) for the answer to be submitted;\r\n- The script will convert the recorded audio to text using Whisper;\r\n- The text will be analyzed and a decision will be made. 
If the Assistant believes it needs to take some action to respond (like looking for a past conversation) the langchain agents will make a plan and use their tool to answer.\r\n- Elsewise, the script will then expand the `chat_history` with your question, it will send a request with the API and it will update the history as soon as it receives a full answer from ChatGPT (this may take up to 5-10 seconds, consider explicitly asking for a short answer if you are in a hurry);\r\n- The `say()` function will perform voice duplication to speak with Jarvis/Someone's voice; if the argument is not in English, IBM Watson will send the response from one of their nice text-to-speech models. If everything fails, the functions will rely on pyttsx3 which is a fast yet not as cool alternative;\r\n<p align=\"center\">\r\n\r\n</p>\r\n\r\n- When any of the stop keywords are said, the script will ask ChatGPT to give a title to the conversation and will save the chat in a .txt file with the format 'CurrentDate_Title.txt';\r\n- The assistant will then go back to sleep;\r\n<p align=\"center\">\r\n <img src='https://user-images.githubusercontent.com/49094051/227788180-b9da0957-a58b-4c1c-bc34-4a4c8a0e0957.PNG'/><br>\r\n  <i><span style=\"color:grey\">I made some prompts and closed the conversation</span> </i>\r\n</p>\r\n\r\n\r\n# Keywords:\r\n- to stop or save the chat, just say 'THANKS' at some point;\r\n- To summon JARVIS voice just say 'JARVIS' at some point;\r\n\r\n<span style=\"color:grey\">*not ideal I know but works for now*</span>\r\n\r\n\r\n# History:\r\n- [x] [11 - 2022] Deliver chat-like prompts from Python from a keyboard\r\n- [x] [12 - 2022] Deliver chat-like prompts from Python with voice\r\n- [x] [2  - 2023] International language support for prompt and answers\r\n- [x] [3  - 2023] Jarvis voice set up\r\n- [x] [3  - 2023] Save conversation\r\n- [x] [3  - 2023] Background execution & Voice Summoning\r\n- [x] [3  - 2023] Improve output displayed info\r\n- [x] [3  - 2023] 
Improve JARVIS's voice performances through prompt preprocessing\r\n- [x] [4  - 2023] Introducing: *Project memory* store chats, events, timelines and other relevant information for a given project to be accessed later by the user or the assistant itself \r\n- [x] [4  - 2023] Create a full stack ```VirtualAssistant``` class with memory and local storage access\r\n- [x] [4  - 2023] Add sound feedback at different stages (chimes, beeps...)\r\n- [x] [4  - 2023] International language support for voice commands (beta)\r\n- [x] [4  - 2023] Making a step-by-step tutorial \r\n- [x] [4  - 2023] Move some processing locally to reduce credit consumption: [Vicuna: A new, powerful model based on LLaMa, and trained with GPT-4](https://www.youtube.com/watch?v=ByV5w1ES38A&ab_channel=TroubleChute);\r\n- [x] [4  - 2023] Integrate with Eleven Labs Voices for super expressive voices and outstanding voice cloning;\r\n- [x] [4  - 2023] Extending voice commands and *Actions* (make a better active assistant)\r\n- [x] [4  - 2023] Connect the system to the internet\r\n- [x] [6  - 2023] Connect with paper database\r\n\r\ncurrently working on:\r\n- [ ] Extend doc processing tools\r\n- [ ] Find a free alternative for LangChain Agents\r\n\r\nfollowing:\r\n- [ ] fixing chat length bug (when the chat is too long it can't be processed by ChatGPT 3.5 Turbo)\r\n- [ ] expanding *Memory* \r\n- [ ] crash reports   \r\n- [ ] Refine capabilities\r\n<br>\r\n<br>\r\n\r\n### waiting for ChatGPT4 to:\r\n- [ ] add multimodal input (i.e. 
\"Do you think 'this' [holding a paper plane] could fly\" -> camera -> ChatGPT4 -> \"you should improve the tip of the wings\" )\r\n- [ ] Extend *project memory* to images, pdfs, papers...\r\n\r\n<span style=\"color:grey\">*Check the [UpdateHistory.md](https://github.com/gianmarcoguarnier/JARVIS-ChatGPT/blob/main/UpdateHistory.md) of the project for more insights.*</span>\r\n\r\nHave fun!\r\n\r\n# ERRORS and FAQs\r\ncategories: Install, General, Runtime\r\n### INSTALL: I have conflicting packages while installing *venv_requirements.txt*, what should I do? <br>\r\n1. Make sure you have the right Python version (3.7) in the .venv (run ```python --version``` with the virtual environment activated). \r\n2. Try editing _venv_requirements.txt_ and removing the version requirements of the offending dependencies. \r\n3. Remove the conflicting packages from the txt file altogether and install them manually afterward.<br>\r\n\r\n### INSTALL: I get an error when running openai_api_chatbot.py saying: TypeError: LoadLibrary( ) argument 1 must be str, not None. What's wrong? <br>\r\nThe problem concerns Whisper. You should re-install it manually with ```pip install whisper-openai``` <br>\r\n\r\n### INSTALL: I can't import 'openai.embeddings_utils'<br>\r\n1. Try to ```pip install --upgrade openai```. \r\n2. This happens because openai raised their minimum requirements. I had this problem and solved it by manually downloading [embeddings_utils.py](https://github.com/openai/openai-python/blob/main/openai/embeddings_utils.py) into ./<your_venv>/Lib/site-packages/openai/ \r\n<br>\r\n3. If the problem persists with ```datalib```, raise an issue and I'll provide you with the missing file\r\n4. Upgrade to Python 3.8 (create a new env and re-install TTS and the requirements)\r\n\r\n### INSTALL: I encounter the error ModuleNotFoundError: No module named '\\<some module\\>' <br>\r\nRequirements are not updated with every commit. 
This might generate errors, but you can quickly install the missing modules, and it keeps the environment free of conflicts when I try new packages (and I try LOTS of them) <br>\r\n\r\n### RUN TIME: I encounter an OOM (out-of-memory) error when loading the Whisper model, what does it mean?<br>\r\nIt means the model you selected is too big for your CUDA device memory. Unfortunately, there is not much you can do about it except load a smaller model. If the smaller model does not satisfy you, you might want to speak more clearly or make longer prompts to let the model predict more accurately what you are saying. This sounds inconvenient but, in my case, it greatly improved my spoken English :) <br>\r\n\r\n### RUN TIME: Max length tokens for ChatGPT-3.5-Turbo is 4096 but received... tokens.<br>\r\nThis bug is still present: don't expect to have very long conversations with your assistant, as at some point it will simply not have enough memory to remember the whole conversation. A fix is in development; it might consist of adopting a 'sliding window' approach, even if that might cause some concepts to be repeated. <br>\r\n\r\n### GENERAL: I finished my OPENAI credit/demo, what can I do? <br>\r\n1. Go online only. The price is not that bad and you might end up paying a few dollars a month since pricing depends on usage (with heavy testing I ended up consuming the equivalent of ~4 dollars a month during my free trial). You can set limits on your monthly token consumption. \r\n2. Use a Hybrid mode where the most credit-intensive tasks are executed locally for free and the rest is done online. \r\n3. Install Vicuna and run in OFFLINE mode only, with limited performance. \r\n\r\n### GENERAL: For how long will this project be updated? \r\nRight now (April 2023) I'm working almost non-stop on this. I will likely take a break in the summer because I'll be working on my thesis. 
\r\n\r\nIf you have questions, you can contact me by raising an Issue and I'll do my best to help as soon as possible.\r\n<p align=\"right\"><i>Gianmarco Guarnier</i></p>\r\n"
  },
  {
    "path": "TTS/.models.json",
    "content": "{\n    \"tts_models\": {\n        \"multilingual\":{\n            \"multi-dataset\":{\n                \"your_tts\":{\n                    \"description\": \"Your TTS model accompanying the paper https://arxiv.org/abs/2112.02418\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.10.1_models/tts_models--multilingual--multi-dataset--your_tts.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": \"e9a1953e\",\n                    \"license\": \"CC BY-NC-ND 4.0\",\n                    \"contact\": \"egolge@coqui.ai\"\n                }\n            }\n        },\n        \"bg\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--bg--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"cs\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--cs--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"da\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--da--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"et\": {\n            \"cv\": {\n                \"vits\":{\n       
             \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--et--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"ga\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--ga--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"en\": {\n            \"ek1\": {\n                \"tacotron2\": {\n                    \"description\": \"EK1 en-rp tacotron2 by NMStoker\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ek1--tacotron2.zip\",\n                    \"default_vocoder\": \"vocoder_models/en/ek1/wavegrad\",\n                    \"commit\": \"c802255\",\n                    \"license\": \"apache 2.0\"\n                }\n            },\n            \"ljspeech\": {\n                \"tacotron2-DDC\": {\n                    \"description\": \"Tacotron2 with Double Decoder Consistency.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ljspeech--tacotron2-DDC.zip\",\n                    \"default_vocoder\": \"vocoder_models/en/ljspeech/hifigan_v2\",\n                    \"commit\": \"bae2ad0f\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"tacotron2-DDC_ph\": {\n                    \"description\": \"Tacotron2 with Double Decoder Consistency with 
phonemes.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ljspeech--tacotron2-DDC_ph.zip\",\n                    \"default_vocoder\": \"vocoder_models/en/ljspeech/univnet\",\n                    \"commit\": \"3900448\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"glow-tts\": {\n                    \"description\": \"\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ljspeech--glow-tts.zip\",\n                    \"stats_file\": null,\n                    \"default_vocoder\": \"vocoder_models/en/ljspeech/multiband-melgan\",\n                    \"commit\": \"\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"MPL\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"speedy-speech\": {\n                    \"description\": \"Speedy Speech model trained on LJSpeech dataset using the Alignment Network for learning the durations.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ljspeech--speedy-speech.zip\",\n                    \"stats_file\": null,\n                    \"default_vocoder\": \"vocoder_models/en/ljspeech/hifigan_v2\",\n                    \"commit\": \"4581e3d\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"tacotron2-DCA\": {\n                    \"description\": \"\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ljspeech--tacotron2-DCA.zip\",\n                    \"default_vocoder\": 
\"vocoder_models/en/ljspeech/multiband-melgan\",\n                    \"commit\": \"\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"MPL\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"vits\": {\n                    \"description\": \"VITS is an End2End TTS model trained on LJSpeech dataset with phonemes.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ljspeech--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": \"3900448\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"vits--neon\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--en--ljspeech--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\",\n                    \"contact\": null,\n                    \"commit\": null\n                },\n                \"fast_pitch\": {\n                    \"description\": \"FastPitch model trained on LJSpeech using the Aligner Network\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--ljspeech--fast_pitch.zip\",\n                    \"default_vocoder\": \"vocoder_models/en/ljspeech/hifigan_v2\",\n                    \"commit\": \"b27b3ba\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"overflow\": {\n                    \"description\": \"Overflow model trained on LJSpeech\",\n                    \"github_rls_url\": 
\"https://coqui.gateway.scarf.sh/v0.10.0_models/tts_models--en--ljspeech--overflow.zip\",\n                    \"default_vocoder\": \"vocoder_models/en/ljspeech/hifigan_v2\",\n                    \"commit\": \"3b1a28f\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.ai\"\n                },\n                \"neural_hmm\": {\n                    \"description\": \"Neural HMM model trained on LJSpeech\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.11.0_models/tts_models--en--ljspeech--neural_hmm.zip\",\n                    \"default_vocoder\": \"vocoder_models/en/ljspeech/hifigan_v2\",\n                    \"commit\": \"3b1a28f\",\n                    \"author\": \"Shivam Metha @shivammehta25\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"d83ee8fe45e3c0d776d4a865aca21d7c2ac324c4\"\n                }\n            },\n            \"vctk\": {\n                \"vits\": {\n                    \"description\": \"VITS End2End TTS model trained on VCTK dataset with 109 different speakers with EN accent.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--vctk--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": \"3900448\",\n                    \"author\": \"Eren @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.ai\"\n                },\n                \"fast_pitch\":{\n                    \"description\": \"FastPitch model trained on VCTK dataseset.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--vctk--fast_pitch.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": \"bdab788d\",\n                    \"author\": \"Eren @erogol\",\n               
     \"license\": \"CC BY-NC-ND 4.0\",\n                    \"contact\": \"egolge@coqui.ai\"\n                }\n            },\n            \"sam\": {\n                \"tacotron-DDC\": {\n                    \"description\": \"Tacotron2 with Double Decoder Consistency trained with Aceenture's Sam dataset.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--en--sam--tacotron-DDC.zip\",\n                    \"default_vocoder\": \"vocoder_models/en/sam/hifigan_v2\",\n                    \"commit\": \"bae2ad0f\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.com\"\n                }\n            },\n            \"blizzard2013\": {\n                \"capacitron-t2-c50\": {\n                    \"description\": \"Capacitron additions to Tacotron 2 with Capacity at 50 as in https://arxiv.org/pdf/1906.03402.pdf\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.7.0_models/tts_models--en--blizzard2013--capacitron-t2-c50.zip\",\n                    \"commit\": \"d6284e7\",\n                    \"default_vocoder\": \"vocoder_models/en/blizzard2013/hifigan_v2\",\n                    \"author\": \"Adam Froghyar @a-froghyar\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"adamfroghyar@gmail.com\"\n                },\n                \"capacitron-t2-c150_v2\": {\n                    \"description\": \"Capacitron additions to Tacotron 2 with Capacity at 150 as in https://arxiv.org/pdf/1906.03402.pdf\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.7.1_models/tts_models--en--blizzard2013--capacitron-t2-c150_v2.zip\",\n                    \"commit\": \"a67039d\",\n                    \"default_vocoder\": \"vocoder_models/en/blizzard2013/hifigan_v2\",\n                    \"author\": \"Adam Froghyar @a-froghyar\",\n           
         \"license\": \"apache 2.0\",\n                    \"contact\": \"adamfroghyar@gmail.com\"\n                }\n            }\n        },\n        \"es\": {\n            \"mai\": {\n                \"tacotron2-DDC\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--es--mai--tacotron2-DDC.zip\",\n                    \"default_vocoder\": \"vocoder_models/universal/libri-tts/fullband-melgan\",\n                    \"commit\": \"\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"MPL\",\n                    \"contact\": \"egolge@coqui.com\"\n                }\n            },\n            \"css10\":{\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--es--css10--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"fr\": {\n            \"mai\": {\n                \"tacotron2-DDC\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--fr--mai--tacotron2-DDC.zip\",\n                    \"default_vocoder\": \"vocoder_models/universal/libri-tts/fullband-melgan\",\n                    \"commit\": null,\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"MPL\",\n                    \"contact\": \"egolge@coqui.com\"\n                }\n            },\n            \"css10\":{\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--fr--css10--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": 
\"bsd-3-clause\"\n                }\n            }\n        },\n        \"uk\":{\n            \"mai\": {\n                \"glow-tts\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--uk--mai--glow-tts.zip\",\n                    \"author\":\"@robinhad\",\n                    \"commit\": \"bdab788d\",\n                    \"license\": \"MIT\",\n                    \"contact\": \"\",\n                    \"default_vocoder\": \"vocoder_models/uk/mai/multiband-melgan\"\n                },\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--uk--mai--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"zh-CN\": {\n            \"baker\": {\n                \"tacotron2-DDC-GST\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--zh-CN--baker--tacotron2-DDC-GST.zip\",\n                    \"commit\": \"unknown\",\n                    \"author\": \"@kirianguiller\",\n                    \"license\": \"apache 2.0\",\n                    \"default_vocoder\": null\n                }\n            }\n        },\n        \"nl\": {\n            \"mai\": {\n                \"tacotron2-DDC\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--nl--mai--tacotron2-DDC.zip\",\n                    \"author\": \"@r-dh\",\n                    \"license\": \"apache 2.0\",\n                    \"default_vocoder\": \"vocoder_models/nl/mai/parallel-wavegan\",\n                    \"stats_file\": null,\n                    \"commit\": \"540d811\"\n                }\n            },\n            \"css10\":{\n                \"vits\":{\n                    
\"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--nl--css10--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"de\": {\n            \"thorsten\": {\n                \"tacotron2-DCA\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--de--thorsten--tacotron2-DCA.zip\",\n                    \"default_vocoder\": \"vocoder_models/de/thorsten/fullband-melgan\",\n                    \"author\": \"@thorstenMueller\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"unknown\"\n                },\n                \"vits\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.7.0_models/tts_models--de--thorsten--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"author\": \"@thorstenMueller\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"unknown\"\n                },\n                \"tacotron2-DDC\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--de--thorsten--tacotron2-DDC.zip\",\n                    \"default_vocoder\": \"vocoder_models/de/thorsten/hifigan_v1\",\n                    \"description\": \"Thorsten-Dec2021-22k-DDC\",\n                    \"author\": \"@thorstenMueller\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"unknown\"\n                }\n            },\n            \"css10\": {\n                \"vits-neon\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--de--css10--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"author\": \"@NeonGeckoCom\",\n              
      \"license\": \"bsd-3-clause\",\n                    \"commit\": null\n                }\n            }\n        },\n        \"ja\": {\n            \"kokoro\": {\n                \"tacotron2-DDC\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--ja--kokoro--tacotron2-DDC.zip\",\n                    \"default_vocoder\": \"vocoder_models/ja/kokoro/hifigan_v1\",\n                    \"description\": \"Tacotron2 with Double Decoder Consistency trained with Kokoro Speech Dataset.\",\n                    \"author\": \"@kaiidams\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"401fbd89\"\n                }\n            }\n        },\n        \"tr\":{\n            \"common-voice\": {\n                \"glow-tts\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--tr--common-voice--glow-tts.zip\",\n                    \"default_vocoder\": \"vocoder_models/tr/common-voice/hifigan\",\n                    \"license\": \"MIT\",\n                    \"description\": \"Turkish GlowTTS model using an unknown speaker from the Common-Voice dataset.\",\n                    \"author\": \"Fatih Akademi\",\n                    \"commit\": null\n                }\n            }\n        },\n        \"it\": {\n            \"mai_female\": {\n                \"glow-tts\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--it--mai_female--glow-tts.zip\",\n                    \"default_vocoder\": null,\n                    \"description\": \"GlowTTS model as explained on https://github.com/coqui-ai/TTS/issues/1148.\",\n                    \"author\": \"@nicolalandro\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": null\n                },\n                \"vits\":{\n                    \"github_rls_url\": 
\"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--it--mai_female--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"description\": \"GlowTTS model as explained on https://github.com/coqui-ai/TTS/issues/1148.\",\n                    \"author\": \"@nicolalandro\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": null\n                }\n            },\n            \"mai_male\": {\n                \"glow-tts\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--it--mai_male--glow-tts.zip\",\n                    \"default_vocoder\": null,\n                    \"description\": \"GlowTTS model as explained on https://github.com/coqui-ai/TTS/issues/1148.\",\n                    \"author\": \"@nicolalandro\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": null\n                },\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/tts_models--it--mai_male--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"description\": \"GlowTTS model as explained on https://github.com/coqui-ai/TTS/issues/1148.\",\n                    \"author\": \"@nicolalandro\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": null\n                }\n            }\n        },\n        \"ewe\": {\n            \"openbible\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.2_models/tts_models--ewe--openbible--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"license\": \"CC-BY-SA 4.0\",\n                    \"description\": \"Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.\",\n                    \"author\": \"@coqui_ai\",\n                    \"commit\": \"1b22f03\"\n          
      }\n            }\n        },\n        \"hau\": {\n            \"openbible\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.2_models/tts_models--hau--openbible--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"license\": \"CC-BY-SA 4.0\",\n                    \"description\": \"Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.\",\n                    \"author\": \"@coqui_ai\",\n                    \"commit\": \"1b22f03\"\n                }\n            }\n        },\n        \"lin\": {\n            \"openbible\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.2_models/tts_models--lin--openbible--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"license\": \"CC-BY-SA 4.0\",\n                    \"description\": \"Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.\",\n                    \"author\": \"@coqui_ai\",\n                    \"commit\": \"1b22f03\"\n                }\n            }\n        },\n        \"tw_akuapem\": {\n            \"openbible\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.2_models/tts_models--tw_akuapem--openbible--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"license\": \"CC-BY-SA 4.0\",\n                    \"description\": \"Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.\",\n                    \"author\": \"@coqui_ai\",\n                    \"commit\": \"1b22f03\"\n                }\n            }\n        },\n        \"tw_asante\": {\n            \"openbible\": {\n                \"vits\":{\n                    \"github_rls_url\": 
\"https://coqui.gateway.scarf.sh/v0.6.2_models/tts_models--tw_asante--openbible--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"license\": \"CC-BY-SA 4.0\",\n                    \"description\": \"Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.\",\n                    \"author\": \"@coqui_ai\",\n                    \"commit\": \"1b22f03\"\n                }\n            }\n        },\n        \"yor\": {\n            \"openbible\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.2_models/tts_models--yor--openbible--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"license\": \"CC-BY-SA 4.0\",\n                    \"description\": \"Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.\",\n                    \"author\": \"@coqui_ai\",\n                    \"commit\": \"1b22f03\"\n                }\n            }\n        },\n        \"hu\": {\n            \"css10\": {\n                \"vits\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--hu--css10--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"el\": {\n            \"cv\": {\n                \"vits\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--el--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"fi\": {\n            \"css10\": {\n                \"vits\":{\n           
         \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--fi--css10--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"hr\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--hr--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"lt\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--lt--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"lv\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--lv--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"mt\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--mt--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": 
\"bsd-3-clause\"\n                }\n            }\n        },\n        \"pl\": {\n            \"mai_female\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--pl--mai_female--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"pt\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--pt--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"ro\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--ro--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"sk\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--sk--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"sl\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--sl--cv--vits.zip\",\n                    
\"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"sv\": {\n            \"cv\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.8.0_models/tts_models--sv--cv--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"author\": \"@NeonGeckoCom\",\n                    \"license\": \"bsd-3-clause\"\n                }\n            }\n        },\n        \"ca\": {\n            \"custom\": {\n                \"vits\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.10.1_models/tts_models--ca--custom--vits.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"description\": \" It is trained from zero with 101460 utterances consisting of 257 speakers, approx 138 hours of speech. We used three datasets;\\nFestcat and Google Catalan TTS (both TTS datasets) and also a part of Common Voice 8. It is trained with TTS v0.8.0.\\nhttps://github.com/coqui-ai/TTS/discussions/930#discussioncomment-4466345\",\n                    \"author\": \"@gullabi\",\n                    \"license\": \"CC-BY-4.0\"\n                }\n            }\n        },\n        \"fa\":{\n            \"custom\":{\n                \"glow-tts\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.10.1_models/tts_models--fa--custom--glow-tts.zip\",\n                    \"default_vocoder\": null,\n                    \"commit\": null,\n                    \"description\": \"persian-tts-female-glow_tts model for text to speech purposes. Single-speaker female voice Trained on persian-tts-dataset-famale. \\nThis model has no compatible vocoder thus the output quality is not very good. 
\\nDataset: https://www.kaggle.com/datasets/magnoliasis/persian-tts-dataset-famale.\",\n                    \"author\": \"@karim23657\",\n                    \"license\": \"CC-BY-4.0\"\n                }\n            }\n        }\n    },\n    \"vocoder_models\": {\n        \"universal\": {\n            \"libri-tts\": {\n                \"wavegrad\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--universal--libri-tts--wavegrad.zip\",\n                    \"commit\": \"ea976b0\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"MPL\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n                \"fullband-melgan\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--universal--libri-tts--fullband-melgan.zip\",\n                    \"commit\": \"4132240\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"MPL\",\n                    \"contact\": \"egolge@coqui.com\"\n                }\n            }\n        },\n        \"en\": {\n            \"ek1\": {\n                \"wavegrad\": {\n                    \"description\": \"EK1 en-rp wavegrad by NMStoker\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--en--ek1--wavegrad.zip\",\n                    \"commit\": \"c802255\",\n                    \"license\": \"apache 2.0\"\n                }\n            },\n            \"ljspeech\": {\n                \"multiband-melgan\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--en--ljspeech--multiband-melgan.zip\",\n                    \"commit\": \"ea976b0\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"MPL\",\n                    \"contact\": \"egolge@coqui.com\"\n                },\n    
            \"hifigan_v2\": {\n                    \"description\": \"HiFiGAN_v2 LJSpeech vocoder from https://arxiv.org/abs/2010.05646.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--en--ljspeech--hifigan_v2.zip\",\n                    \"commit\": \"bae2ad0f\",\n                    \"author\": \"@erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.ai\"\n                },\n                \"univnet\": {\n                    \"description\": \"UnivNet model finetuned on TacotronDDC_ph spectrograms for better compatibility.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--en--ljspeech--univnet_v2.zip\",\n                    \"commit\": \"4581e3d\",\n                    \"author\": \"Eren @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.ai\"\n                }\n            },\n            \"blizzard2013\": {\n                \"hifigan_v2\": {\n                    \"description\": \"HiFiGAN_v2 LJSpeech vocoder from https://arxiv.org/abs/2010.05646.\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.7.0_models/vocoder_models--en--blizzard2013--hifigan_v2.zip\",\n                    \"commit\": \"d6284e7\",\n                    \"author\": \"Adam Froghyar @a-froghyar\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"adamfroghyar@gmail.com\"\n                }\n            },\n            \"vctk\": {\n                \"hifigan_v2\": {\n                    \"description\": \"Finetuned and intended to be used with tts_models/en/vctk/sc-glow-tts\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--en--vctk--hifigan_v2.zip\",\n                    \"commit\": \"2f07160\",\n                    \"author\": \"Edresson 
Casanova\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"\"\n                }\n            },\n            \"sam\": {\n                \"hifigan_v2\": {\n                    \"description\": \"Finetuned and intended to be used with tts_models/en/sam/tacotron_DDC\",\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--en--sam--hifigan_v2.zip\",\n                    \"commit\": \"2f07160\",\n                    \"author\": \"Eren Gölge @erogol\",\n                    \"license\": \"apache 2.0\",\n                    \"contact\": \"egolge@coqui.ai\"\n                }\n            }\n        },\n        \"nl\": {\n            \"mai\": {\n                \"parallel-wavegan\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--nl--mai--parallel-wavegan.zip\",\n                    \"author\": \"@r-dh\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"unknown\"\n                }\n            }\n        },\n        \"de\": {\n            \"thorsten\": {\n                \"wavegrad\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--de--thorsten--wavegrad.zip\",\n                    \"author\": \"@thorstenMueller\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"unknown\"\n                },\n                \"fullband-melgan\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--de--thorsten--fullband-melgan.zip\",\n                    \"author\": \"@thorstenMueller\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"unknown\"\n                },\n                \"hifigan_v1\": {\n                    \"github_rls_url\": 
\"https://coqui.gateway.scarf.sh/v0.8.0_models/vocoder_models--de--thorsten--hifigan_v1.zip\",\n                    \"description\": \"HifiGAN vocoder model for Thorsten Neutral Dec2021 22k Samplerate Tacotron2 DDC model\",\n                    \"author\": \"@thorstenMueller\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"unknown\"\n                }\n            }\n        },\n        \"ja\": {\n            \"kokoro\": {\n                \"hifigan_v1\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--ja--kokoro--hifigan_v1.zip\",\n                    \"description\": \"HifiGAN model trained for kokoro dataset by @kaiidams\",\n                    \"author\": \"@kaiidams\",\n                    \"license\": \"apache 2.0\",\n                    \"commit\": \"3900448\"\n                }\n            }\n        },\n        \"uk\": {\n            \"mai\": {\n                \"multiband-melgan\": {\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--uk--mai--multiband-melgan.zip\",\n                    \"author\":\"@robinhad\",\n                    \"commit\": \"bdab788d\",\n                    \"license\": \"MIT\",\n                    \"contact\": \"\"\n                }\n            }\n        },\n        \"tr\":{\n            \"common-voice\": {\n                \"hifigan\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.6.1_models/vocoder_models--tr--common-voice--hifigan.zip\",\n                    \"description\": \"HifiGAN model using an unknown speaker from the Common-Voice dataset.\",\n                    \"author\": \"Fatih Akademi\",\n                    \"license\": \"MIT\",\n                    \"commit\": null\n                }\n            }\n        }\n    },\n    \"voice_conversion_models\":{\n        \"multilingual\":{\n            \"vctk\":{\n                
\"freevc24\":{\n                    \"github_rls_url\": \"https://coqui.gateway.scarf.sh/v0.13.0_models/voice_conversion_models--multilingual--vctk--freevc24.zip\",\n                    \"description\": \"FreeVC model trained on VCTK dataset from https://github.com/OlaWod/FreeVC\",\n                    \"author\": \"Jing-Yi Li @OlaWod\",\n                    \"license\": \"MIT\",\n                    \"commit\": null\n                }\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "TTS/VERSION",
    "content": "0.12.0\n"
  },
  {
    "path": "TTS/__init__.py",
    "content": "import os\n\nwith open(os.path.join(os.path.dirname(__file__), \"VERSION\"), \"r\", encoding=\"utf-8\") as f:\n    version = f.read().strip()\n\n__version__ = version\n"
  },
  {
    "path": "TTS/api.py",
    "content": "import tempfile\nfrom pathlib import Path\n\nfrom TTS.utils.audio.numpy_transforms import save_wav\nfrom TTS.utils.manage import ModelManager\nfrom TTS.utils.synthesizer import Synthesizer\n\n\nclass TTS:\n    \"\"\"TODO: Add voice conversion and Capacitron support.\"\"\"\n\n    def __init__(\n        self,\n        model_name: str = None,\n        model_path: str = None,\n        config_path: str = None,\n        vocoder_path: str = None,\n        vocoder_config_path: str = None,\n        progress_bar: bool = True,\n        gpu=False,\n    ):\n        \"\"\"🐸TTS python interface that allows to load and use the released models.\n\n        Example with a multi-speaker model:\n            >>> from TTS.api import TTS\n            >>> tts = TTS(TTS.list_models()[0])\n            >>> wav = tts.tts(\"This is a test! This is also a test!!\", speaker=tts.speakers[0], language=tts.languages[0])\n            >>> tts.tts_to_file(text=\"Hello world!\", speaker=tts.speakers[0], language=tts.languages[0], file_path=\"output.wav\")\n\n        Example with a single-speaker model:\n            >>> tts = TTS(model_name=\"tts_models/de/thorsten/tacotron2-DDC\", progress_bar=False, gpu=False)\n            >>> tts.tts_to_file(text=\"Ich bin eine Testnachricht.\", file_path=\"output.wav\")\n\n        Example loading a model from a path:\n            >>> tts = TTS(model_path=\"/path/to/checkpoint_100000.pth\", config_path=\"/path/to/config.json\", progress_bar=False, gpu=False)\n            >>> tts.tts_to_file(text=\"Ich bin eine Testnachricht.\", file_path=\"output.wav\")\n\n        Example voice cloning with YourTTS in English, French and Portuguese:\n            >>> tts = TTS(model_name=\"tts_models/multilingual/multi-dataset/your_tts\", progress_bar=False, gpu=True)\n            >>> tts.tts_to_file(\"This is voice cloning.\", speaker_wav=\"my/cloning/audio.wav\", language=\"en\", file_path=\"thisisit.wav\")\n            >>> tts.tts_to_file(\"C'est le clonage de la 
voix.\", speaker_wav=\"my/cloning/audio.wav\", language=\"fr\", file_path=\"thisisit.wav\")\n            >>> tts.tts_to_file(\"Isso é clonagem de voz.\", speaker_wav=\"my/cloning/audio.wav\", language=\"pt\", file_path=\"thisisit.wav\")\n\n        Args:\n            model_name (str, optional): Model name to load. You can list models by ```tts.models```. Defaults to None.\n            model_path (str, optional): Path to the model checkpoint. Defaults to None.\n            config_path (str, optional): Path to the model config. Defaults to None.\n            vocoder_path (str, optional): Path to the vocoder checkpoint. Defaults to None.\n            vocoder_config_path (str, optional): Path to the vocoder config. Defaults to None.\n            progress_bar (bool, optional): Whether to print a progress bar while downloading a model. Defaults to True.\n            gpu (bool, optional): Enable/disable GPU. Some models might be too slow on CPU. Defaults to False.\n        \"\"\"\n        self.manager = ModelManager(models_file=self.get_models_file_path(), progress_bar=progress_bar, verbose=False)\n\n        self.synthesizer = None\n        self.voice_converter = None\n\n        if model_name:\n            self.load_tts_model_by_name(model_name, gpu)\n        if model_path:\n            self.load_tts_model_by_path(\n                model_path, config_path, vocoder_path=vocoder_path, vocoder_config=vocoder_config_path, gpu=gpu\n            )\n\n    @property\n    def models(self):\n        return self.manager.list_tts_models()\n\n    @property\n    def is_multi_speaker(self):\n        if hasattr(self.synthesizer.tts_model, \"speaker_manager\") and self.synthesizer.tts_model.speaker_manager:\n            return self.synthesizer.tts_model.speaker_manager.num_speakers > 1\n        return False\n\n    @property\n    def is_multi_lingual(self):\n        if hasattr(self.synthesizer.tts_model, \"language_manager\") and self.synthesizer.tts_model.language_manager:\n            
return self.synthesizer.tts_model.language_manager.num_languages > 1\n        return False\n\n    @property\n    def speakers(self):\n        if not self.is_multi_speaker:\n            return None\n        return self.synthesizer.tts_model.speaker_manager.speaker_names\n\n    @property\n    def languages(self):\n        if not self.is_multi_lingual:\n            return None\n        return self.synthesizer.tts_model.language_manager.language_names\n\n    @staticmethod\n    def get_models_file_path():\n        return Path(__file__).parent / \".models.json\"\n\n    @staticmethod\n    def list_models():\n        manager = ModelManager(models_file=TTS.get_models_file_path(), progress_bar=False, verbose=False)\n        return manager.list_tts_models()\n\n    def download_model_by_name(self, model_name: str):\n        model_path, config_path, model_item = self.manager.download_model(model_name)\n        if model_item.get(\"default_vocoder\") is None:\n            return model_path, config_path, None, None\n        vocoder_path, vocoder_config_path, _ = self.manager.download_model(model_item[\"default_vocoder\"])\n        return model_path, config_path, vocoder_path, vocoder_config_path\n\n    def load_vc_model_by_name(self, model_name: str, gpu: bool = False):\n        \"\"\"Load one of the voice conversion models by name.\n\n        Args:\n            model_name (str): Model name to load. You can list models by ```tts.models```.\n            gpu (bool, optional): Enable/disable GPU. Some models might be too slow on CPU. Defaults to False.\n        \"\"\"\n        model_path, config_path, _, _ = self.download_model_by_name(model_name)\n        self.voice_converter = Synthesizer(vc_checkpoint=model_path, vc_config=config_path, use_cuda=gpu)\n\n    def load_tts_model_by_name(self, model_name: str, gpu: bool = False):\n        \"\"\"Load one of 🐸TTS models by name.\n\n        Args:\n            model_name (str): Model name to load. 
You can list models by ```tts.models```.\n            gpu (bool, optional): Enable/disable GPU. Some models might be too slow on CPU. Defaults to False.\n\n        TODO: Add tests\n        \"\"\"\n\n        model_path, config_path, vocoder_path, vocoder_config_path = self.download_model_by_name(model_name)\n\n        # init synthesizer\n        # None values are fetched from the model\n        self.synthesizer = Synthesizer(\n            tts_checkpoint=model_path,\n            tts_config_path=config_path,\n            tts_speakers_file=None,\n            tts_languages_file=None,\n            vocoder_checkpoint=vocoder_path,\n            vocoder_config=vocoder_config_path,\n            encoder_checkpoint=None,\n            encoder_config=None,\n            use_cuda=gpu,\n        )\n\n    def load_tts_model_by_path(\n        self, model_path: str, config_path: str, vocoder_path: str = None, vocoder_config: str = None, gpu: bool = False\n    ):\n        \"\"\"Load a model from a path.\n\n        Args:\n            model_path (str): Path to the model checkpoint.\n            config_path (str): Path to the model config.\n            vocoder_path (str, optional): Path to the vocoder checkpoint. Defaults to None.\n            vocoder_config (str, optional): Path to the vocoder config. Defaults to None.\n            gpu (bool, optional): Enable/disable GPU. Some models might be too slow on CPU. 
Defaults to False.\n        \"\"\"\n\n        self.synthesizer = Synthesizer(\n            tts_checkpoint=model_path,\n            tts_config_path=config_path,\n            tts_speakers_file=None,\n            tts_languages_file=None,\n            vocoder_checkpoint=vocoder_path,\n            vocoder_config=vocoder_config,\n            encoder_checkpoint=None,\n            encoder_config=None,\n            use_cuda=gpu,\n        )\n\n    def _check_arguments(self, speaker: str = None, language: str = None, speaker_wav: str = None):\n        if self.is_multi_speaker and (speaker is None and speaker_wav is None):\n            raise ValueError(\"Model is multi-speaker but no speaker is provided.\")\n        if self.is_multi_lingual and language is None:\n            raise ValueError(\"Model is multi-lingual but no language is provided.\")\n        if not self.is_multi_speaker and speaker is not None:\n            raise ValueError(\"Model is not multi-speaker but speaker is provided.\")\n        if not self.is_multi_lingual and language is not None:\n            raise ValueError(\"Model is not multi-lingual but language is provided.\")\n\n    def tts(self, text: str, speaker: str = None, language: str = None, speaker_wav: str = None):\n        \"\"\"Convert text to speech.\n\n        Args:\n            text (str):\n                Input text to synthesize.\n            speaker (str, optional):\n                Speaker name for multi-speaker. You can check whether loaded model is multi-speaker by\n                `tts.is_multi_speaker` and list speakers by `tts.speakers`. Defaults to None.\n            language (str, optional):\n                Language code for multi-lingual models. You can check whether loaded model is multi-lingual\n                `tts.is_multi_lingual` and list available languages by `tts.languages`. 
Defaults to None.\n            speaker_wav (str, optional):\n                Path to a reference wav file to use for voice cloning with supporting models like YourTTS.\n                Defaults to None.\n        \"\"\"\n        self._check_arguments(speaker=speaker, language=language, speaker_wav=speaker_wav)\n\n        wav = self.synthesizer.tts(\n            text=text,\n            speaker_name=speaker,\n            language_name=language,\n            speaker_wav=speaker_wav,\n            reference_wav=None,\n            style_wav=None,\n            style_text=None,\n            reference_speaker_name=None,\n        )\n        return wav\n\n    def tts_to_file(\n        self,\n        text: str,\n        speaker: str = None,\n        language: str = None,\n        speaker_wav: str = None,\n        file_path: str = \"output.wav\",\n    ):\n        \"\"\"Convert text to speech.\n\n        Args:\n            text (str):\n                Input text to synthesize.\n            speaker (str, optional):\n                Speaker name for multi-speaker. You can check whether loaded model is multi-speaker by\n                `tts.is_multi_speaker` and list speakers by `tts.speakers`. Defaults to None.\n            language (str, optional):\n                Language code for multi-lingual models. You can check whether loaded model is multi-lingual\n                `tts.is_multi_lingual` and list available languages by `tts.languages`. Defaults to None.\n            speaker_wav (str, optional):\n                Path to a reference wav file to use for voice cloning with supporting models like YourTTS.\n                Defaults to None.\n            file_path (str, optional):\n                Output file path. 
Defaults to \"output.wav\".\n        \"\"\"\n        wav = self.tts(text=text, speaker=speaker, language=language, speaker_wav=speaker_wav)\n        self.synthesizer.save_wav(wav=wav, path=file_path)\n\n    def voice_conversion(\n        self,\n        source_wav: str,\n        target_wav: str,\n    ):\n        \"\"\"Voice conversion with FreeVC. Convert source wav to target speaker.\n\n        Args:\n            source_wav (str):\n                Path to the source wav file.\n            target_wav (str):\n                Path to the target wav file.\n        \"\"\"\n        wav = self.synthesizer.voice_conversion(source_wav=source_wav, target_wav=target_wav)\n        return wav\n\n    def tts_with_vc(self, text: str, language: str = None, speaker_wav: str = None):\n        \"\"\"Convert text to speech with voice conversion.\n\n        It combines tts with voice conversion to fake voice cloning.\n\n        - Convert text to speech with tts.\n        - Convert the output wav to target speaker with voice conversion.\n\n        Args:\n            text (str):\n                Input text to synthesize.\n            language (str, optional):\n                Language code for multi-lingual models. You can check whether loaded model is multi-lingual\n                `tts.is_multi_lingual` and list available languages by `tts.languages`. Defaults to None.\n            speaker_wav (str, optional):\n                Path to a reference wav file to use for voice cloning with supporting models like YourTTS.\n                Defaults to None.\n        \"\"\"\n        with tempfile.NamedTemporaryFile(suffix=\".wav\", delete=False) as fp:\n            # Lazy code... 
save it to a temp file to resample it while reading it for VC\n            self.tts_to_file(text=text, speaker=None, language=language, file_path=fp.name)\n        if self.voice_converter is None:\n            self.load_vc_model_by_name(\"voice_conversion_models/multilingual/vctk/freevc24\")\n        wav = self.voice_converter.voice_conversion(source_wav=fp.name, target_wav=speaker_wav)\n        return wav\n\n    def tts_with_vc_to_file(\n        self, text: str, language: str = None, speaker_wav: str = None, file_path: str = \"output.wav\"\n    ):\n        \"\"\"Convert text to speech with voice conversion and save to file.\n\n        Check `tts_with_vc` for more details.\n\n        Args:\n            text (str):\n                Input text to synthesize.\n            language (str, optional):\n                Language code for multi-lingual models. You can check whether loaded model is multi-lingual\n                `tts.is_multi_lingual` and list available languages by `tts.languages`. Defaults to None.\n            speaker_wav (str, optional):\n                Path to a reference wav file to use for voice cloning with supporting models like YourTTS.\n                Defaults to None.\n            file_path (str, optional):\n                Output file path. Defaults to \"output.wav\".\n        \"\"\"\n        wav = self.tts_with_vc(text=text, language=language, speaker_wav=speaker_wav)\n        save_wav(wav=wav, path=file_path, sample_rate=self.voice_converter.vc_config.audio.output_sample_rate)\n"
  },
  {
    "path": "TTS/bin/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/bin/collect_env_info.py",
    "content": "\"\"\"Get detailed info about the working environment.\"\"\"\nimport os\nimport platform\nimport sys\n\nimport numpy\nimport torch\n\nsys.path += [os.path.abspath(\"..\"), os.path.abspath(\".\")]\nimport json\n\nimport TTS\n\n\ndef system_info():\n    return {\n        \"OS\": platform.system(),\n        \"architecture\": platform.architecture(),\n        \"version\": platform.version(),\n        \"processor\": platform.processor(),\n        \"python\": platform.python_version(),\n    }\n\n\ndef cuda_info():\n    return {\n        \"GPU\": [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())],\n        \"available\": torch.cuda.is_available(),\n        \"version\": torch.version.cuda,\n    }\n\n\ndef package_info():\n    return {\n        \"numpy\": numpy.__version__,\n        \"PyTorch_version\": torch.__version__,\n        \"PyTorch_debug\": torch.version.debug,\n        \"TTS\": TTS.__version__,\n    }\n\n\ndef main():\n    details = {\"System\": system_info(), \"CUDA\": cuda_info(), \"Packages\": package_info()}\n    print(json.dumps(details, indent=4, sort_keys=True))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/bin/compute_attention_masks.py",
    "content": "import argparse\nimport importlib\nimport os\nfrom argparse import RawTextHelpFormatter\n\nimport numpy as np\nimport torch\nfrom torch.utils.data import DataLoader\nfrom tqdm import tqdm\n\nfrom TTS.config import load_config\nfrom TTS.tts.datasets.TTSDataset import TTSDataset\nfrom TTS.tts.models import setup_model\nfrom TTS.tts.utils.text.characters import make_symbols, phonemes, symbols\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.utils.io import load_checkpoint\n\nif __name__ == \"__main__\":\n    # pylint: disable=bad-option-value\n    parser = argparse.ArgumentParser(\n        description=\"\"\"Extract attention masks from trained Tacotron/Tacotron2 models.\nThese masks can be used for different purposes including training a TTS model with a Duration Predictor.\\n\\n\"\"\"\n        \"\"\"Each attention mask is written to the same path as the input wav file with \".npy\" file extension.\n(e.g. path/bla.wav (wav file) --> path/bla.npy (attention mask))\\n\"\"\"\n        \"\"\"\nExample run:\n    CUDA_VISIBLE_DEVICE=\"0\" python TTS/bin/compute_attention_masks.py\n        --model_path /data/rw/home/Models/ljspeech-dcattn-December-14-2020_11+10AM-9d0e8c7/checkpoint_200000.pth\n        --config_path /data/rw/home/Models/ljspeech-dcattn-December-14-2020_11+10AM-9d0e8c7/config.json\n        --dataset_metafile metadata.csv\n        --data_path /root/LJSpeech-1.1/\n        --batch_size 32\n        --dataset ljspeech\n        --use_cuda True\n\"\"\",\n        formatter_class=RawTextHelpFormatter,\n    )\n    parser.add_argument(\"--model_path\", type=str, required=True, help=\"Path to Tacotron/Tacotron2 model file \")\n    parser.add_argument(\n        \"--config_path\",\n        type=str,\n        required=True,\n        help=\"Path to Tacotron/Tacotron2 config file.\",\n    )\n    parser.add_argument(\n        \"--dataset\",\n        type=str,\n        default=\"\",\n        required=True,\n        help=\"Target dataset processor name from 
TTS.tts.datasets.formatters.\",\n    )\n\n    parser.add_argument(\n        \"--dataset_metafile\",\n        type=str,\n        default=\"\",\n        required=True,\n        help=\"Dataset metafile including file paths with transcripts.\",\n    )\n    parser.add_argument(\"--data_path\", type=str, default=\"\", help=\"Defines the data path. It overwrites config.json.\")\n    parser.add_argument(\"--use_cuda\", type=bool, default=False, help=\"enable/disable cuda.\")\n\n    parser.add_argument(\n        \"--batch_size\", default=16, type=int, help=\"Batch size for the model. Use batch_size=1 if you have no CUDA.\"\n    )\n    args = parser.parse_args()\n\n    C = load_config(args.config_path)\n    ap = AudioProcessor(**C.audio)\n\n    # if the vocabulary was passed, replace the default\n    if \"characters\" in C.keys():\n        symbols, phonemes = make_symbols(**C.characters)\n\n    # load the model\n    num_chars = len(phonemes) if C.use_phonemes else len(symbols)\n    # TODO: handle multi-speaker\n    model = setup_model(C)\n    model, _ = load_checkpoint(model, args.model_path, args.use_cuda, True)\n\n    # data loader\n    preprocessor = importlib.import_module(\"TTS.tts.datasets.formatters\")\n    preprocessor = getattr(preprocessor, args.dataset)\n    meta_data = preprocessor(args.data_path, args.dataset_metafile)\n    dataset = TTSDataset(\n        model.decoder.r,\n        C.text_cleaner,\n        compute_linear_spec=False,\n        ap=ap,\n        meta_data=meta_data,\n        characters=C.characters if \"characters\" in C.keys() else None,\n        add_blank=C[\"add_blank\"] if \"add_blank\" in C.keys() else False,\n        use_phonemes=C.use_phonemes,\n        phoneme_cache_path=C.phoneme_cache_path,\n        phoneme_language=C.phoneme_language,\n        enable_eos_bos=C.enable_eos_bos_chars,\n    )\n\n    dataset.sort_and_filter_items(C.get(\"sort_by_audio_len\", default=False))\n    loader = DataLoader(\n        dataset,\n        
batch_size=args.batch_size,\n        num_workers=4,\n        collate_fn=dataset.collate_fn,\n        shuffle=False,\n        drop_last=False,\n    )\n\n    # compute attentions\n    file_paths = []\n    with torch.no_grad():\n        for data in tqdm(loader):\n            # setup input data\n            text_input = data[0]\n            text_lengths = data[1]\n            linear_input = data[3]\n            mel_input = data[4]\n            mel_lengths = data[5]\n            stop_targets = data[6]\n            item_idxs = data[7]\n\n            # dispatch data to GPU\n            if args.use_cuda:\n                text_input = text_input.cuda()\n                text_lengths = text_lengths.cuda()\n                mel_input = mel_input.cuda()\n                mel_lengths = mel_lengths.cuda()\n\n            model_outputs = model.forward(text_input, text_lengths, mel_input)\n\n            alignments = model_outputs[\"alignments\"].detach()\n            for idx, alignment in enumerate(alignments):\n                item_idx = item_idxs[idx]\n                # interpolate if r > 1\n                alignment = (\n                    torch.nn.functional.interpolate(\n                        alignment.transpose(0, 1).unsqueeze(0),\n                        size=None,\n                        scale_factor=model.decoder.r,\n                        mode=\"nearest\",\n                        align_corners=None,\n                        recompute_scale_factor=None,\n                    )\n                    .squeeze(0)\n                    .transpose(0, 1)\n                )\n                # remove paddings\n                alignment = alignment[: mel_lengths[idx], : text_lengths[idx]].cpu().numpy()\n                # set file paths\n                wav_file_name = os.path.basename(item_idx)\n                align_file_name = os.path.splitext(wav_file_name)[0] + \"_attn.npy\"\n                file_path = item_idx.replace(wav_file_name, align_file_name)\n                # save 
output\n                wav_file_abs_path = os.path.abspath(item_idx)\n                file_abs_path = os.path.abspath(file_path)\n                file_paths.append([wav_file_abs_path, file_abs_path])\n                np.save(file_path, alignment)\n\n        # output metafile\n        metafile = os.path.join(args.data_path, \"metadata_attn_mask.txt\")\n\n        with open(metafile, \"w\", encoding=\"utf-8\") as f:\n            for p in file_paths:\n                f.write(f\"{p[0]}|{p[1]}\\n\")\n        print(f\" >> Metafile created: {metafile}\")\n"
  },
  {
    "path": "TTS/bin/compute_embeddings.py",
    "content": "import argparse\nimport os\nfrom argparse import RawTextHelpFormatter\n\nimport torch\nfrom tqdm import tqdm\n\nfrom TTS.config import load_config\nfrom TTS.config.shared_configs import BaseDatasetConfig\nfrom TTS.tts.datasets import load_tts_samples\nfrom TTS.tts.utils.managers import save_file\nfrom TTS.tts.utils.speakers import SpeakerManager\n\n\ndef compute_embeddings(\n    model_path,\n    config_path,\n    output_path,\n    old_spakers_file=None,\n    config_dataset_path=None,\n    formatter_name=None,\n    dataset_name=None,\n    dataset_path=None,\n    meta_file_train=None,\n    meta_file_val=None,\n    disable_cuda=False,\n    no_eval=False,\n):\n    use_cuda = torch.cuda.is_available() and not disable_cuda\n\n    if config_dataset_path is not None:\n        c_dataset = load_config(config_dataset_path)\n        meta_data_train, meta_data_eval = load_tts_samples(c_dataset.datasets, eval_split=not no_eval)\n    else:\n        c_dataset = BaseDatasetConfig()\n        c_dataset.formatter = formatter_name\n        c_dataset.dataset_name = dataset_name\n        c_dataset.path = dataset_path\n        if meta_file_train is not None:\n            c_dataset.meta_file_train = meta_file_train\n        if meta_file_val is not None:\n            c_dataset.meta_file_val = meta_file_val\n        meta_data_train, meta_data_eval = load_tts_samples(c_dataset, eval_split=not no_eval)\n\n    if meta_data_eval is None:\n        samples = meta_data_train\n    else:\n        samples = meta_data_train + meta_data_eval\n\n    encoder_manager = SpeakerManager(\n        encoder_model_path=model_path,\n        encoder_config_path=config_path,\n        d_vectors_file_path=old_spakers_file,\n        use_cuda=use_cuda,\n    )\n\n    class_name_key = encoder_manager.encoder_config.class_name_key\n\n    # compute speaker embeddings\n    speaker_mapping = {}\n    for fields in tqdm(samples):\n        class_name = fields[class_name_key]\n        audio_file = 
fields[\"audio_file\"]\n        embedding_key = fields[\"audio_unique_name\"]\n\n        if old_spakers_file is not None and embedding_key in encoder_manager.clip_ids:\n            # get the embedding from the old file\n            embedd = encoder_manager.get_embedding_by_clip(embedding_key)\n        else:\n            # extract the embedding\n            embedd = encoder_manager.compute_embedding_from_clip(audio_file)\n\n        # create speaker_mapping if target dataset is defined\n        speaker_mapping[embedding_key] = {}\n        speaker_mapping[embedding_key][\"name\"] = class_name\n        speaker_mapping[embedding_key][\"embedding\"] = embedd\n\n    if speaker_mapping:\n        # save speaker_mapping if target dataset is defined\n        if os.path.isdir(output_path):\n            mapping_file_path = os.path.join(output_path, \"speakers.pth\")\n        else:\n            mapping_file_path = output_path\n\n        if os.path.dirname(mapping_file_path) != \"\":\n            os.makedirs(os.path.dirname(mapping_file_path), exist_ok=True)\n\n        save_file(speaker_mapping, mapping_file_path)\n        print(\"Speaker embeddings saved at:\", mapping_file_path)\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser(\n        description=\"\"\"Compute embedding vectors for each audio file in a dataset and store them keyed by `{dataset_name}#{file_path}` in a .pth file\\n\\n\"\"\"\n        \"\"\"\n        Example runs:\n        python TTS/bin/compute_embeddings.py --model_path speaker_encoder_model.pth --config_path speaker_encoder_config.json  --config_dataset_path dataset_config.json\n\n        python TTS/bin/compute_embeddings.py --model_path speaker_encoder_model.pth --config_path speaker_encoder_config.json  --formatter_name coqui --dataset_path /path/to/vctk/dataset --dataset_name my_vctk --meta_file_train /path/to/vctk/metafile_train.csv --meta_file_val /path/to/vctk/metafile_eval.csv\n        \"\"\",\n        
formatter_class=RawTextHelpFormatter,\n    )\n    parser.add_argument(\n        \"--model_path\",\n        type=str,\n        help=\"Path to model checkpoint file. It defaults to the released speaker encoder.\",\n        default=\"https://github.com/coqui-ai/TTS/releases/download/speaker_encoder_model/model_se.pth.tar\",\n    )\n    parser.add_argument(\n        \"--config_path\",\n        type=str,\n        help=\"Path to model config file. It defaults to the released speaker encoder config.\",\n        default=\"https://github.com/coqui-ai/TTS/releases/download/speaker_encoder_model/config_se.json\",\n    )\n    parser.add_argument(\n        \"--config_dataset_path\",\n        type=str,\n        help=\"Path to dataset config file. You either need to provide this or `formatter_name`, `dataset_name` and `dataset_path` arguments.\",\n        default=None,\n    )\n    parser.add_argument(\"--output_path\", type=str, help=\"Path for output `pth` or `json` file.\", default=\"speakers.pth\")\n    parser.add_argument(\n        \"--old_file\", type=str, help=\"Previous embedding file to only compute new audios.\", default=None\n    )\n    parser.add_argument(\"--disable_cuda\", type=bool, help=\"Flag to disable cuda.\", default=False)\n    parser.add_argument(\"--no_eval\", type=bool, help=\"Do not compute eval?. Default False\", default=False)\n    parser.add_argument(\n        \"--formatter_name\",\n        type=str,\n        help=\"Name of the formatter to use. You either need to provide this or `config_dataset_path`\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--dataset_name\",\n        type=str,\n        help=\"Name of the dataset to use. You either need to provide this or `config_dataset_path`\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--dataset_path\",\n        type=str,\n        help=\"Path to the dataset. 
You either need to provide this or `config_dataset_path`\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--meta_file_train\",\n        type=str,\n        help=\"Path to the train meta file. If not set, dataset formatter uses the default metafile if it is defined in the formatter. You either need to provide this or `config_dataset_path`\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--meta_file_val\",\n        type=str,\n        help=\"Path to the evaluation meta file. If not set, dataset formatter uses the default metafile if it is defined in the formatter. You either need to provide this or `config_dataset_path`\",\n        default=None,\n    )\n    args = parser.parse_args()\n\n    compute_embeddings(\n        args.model_path,\n        args.config_path,\n        args.output_path,\n        old_spakers_file=args.old_file,\n        config_dataset_path=args.config_dataset_path,\n        formatter_name=args.formatter_name,\n        dataset_name=args.dataset_name,\n        dataset_path=args.dataset_path,\n        meta_file_train=args.meta_file_train,\n        meta_file_val=args.meta_file_val,\n        disable_cuda=args.disable_cuda,\n        no_eval=args.no_eval,\n    )\n"
  },
  {
    "path": "TTS/bin/compute_statistics.py",
    "content": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n\nimport argparse\nimport glob\nimport os\n\nimport numpy as np\nfrom tqdm import tqdm\n\n# from TTS.utils.io import load_config\nfrom TTS.config import load_config\nfrom TTS.tts.datasets import load_tts_samples\nfrom TTS.utils.audio import AudioProcessor\n\n\ndef main():\n    \"\"\"Run preprocessing process.\"\"\"\n    parser = argparse.ArgumentParser(description=\"Compute mean and variance of spectrogtram features.\")\n    parser.add_argument(\"config_path\", type=str, help=\"TTS config file path to define audio processin parameters.\")\n    parser.add_argument(\"out_path\", type=str, help=\"save path (directory and filename).\")\n    parser.add_argument(\n        \"--data_path\",\n        type=str,\n        required=False,\n        help=\"folder including the target set of wavs overriding dataset config.\",\n    )\n    args, overrides = parser.parse_known_args()\n\n    CONFIG = load_config(args.config_path)\n    CONFIG.parse_known_args(overrides, relaxed_parser=True)\n\n    # load config\n    CONFIG.audio.signal_norm = False  # do not apply earlier normalization\n    CONFIG.audio.stats_path = None  # discard pre-defined stats\n\n    # load audio processor\n    ap = AudioProcessor(**CONFIG.audio.to_dict())\n\n    # load the meta data of target dataset\n    if args.data_path:\n        dataset_items = glob.glob(os.path.join(args.data_path, \"**\", \"*.wav\"), recursive=True)\n    else:\n        dataset_items = load_tts_samples(CONFIG.datasets)[0]  # take only train data\n    print(f\" > There are {len(dataset_items)} files.\")\n\n    mel_sum = 0\n    mel_square_sum = 0\n    linear_sum = 0\n    linear_square_sum = 0\n    N = 0\n    for item in tqdm(dataset_items):\n        # compute features\n        wav = ap.load_wav(item if isinstance(item, str) else item[\"audio_file\"])\n        linear = ap.spectrogram(wav)\n        mel = ap.melspectrogram(wav)\n\n        # compute stats\n        N += 
mel.shape[1]\n        mel_sum += mel.sum(1)\n        linear_sum += linear.sum(1)\n        mel_square_sum += (mel**2).sum(axis=1)\n        linear_square_sum += (linear**2).sum(axis=1)\n\n    mel_mean = mel_sum / N\n    mel_scale = np.sqrt(mel_square_sum / N - mel_mean**2)\n    linear_mean = linear_sum / N\n    linear_scale = np.sqrt(linear_square_sum / N - linear_mean**2)\n\n    output_file_path = args.out_path\n    stats = {}\n    stats[\"mel_mean\"] = mel_mean\n    stats[\"mel_std\"] = mel_scale\n    stats[\"linear_mean\"] = linear_mean\n    stats[\"linear_std\"] = linear_scale\n\n    print(f\" > Avg mel spec mean: {mel_mean.mean()}\")\n    print(f\" > Avg mel spec scale: {mel_scale.mean()}\")\n    print(f\" > Avg linear spec mean: {linear_mean.mean()}\")\n    print(f\" > Avg linear spec scale: {linear_scale.mean()}\")\n\n    # set default config values for mean-var scaling\n    CONFIG.audio.stats_path = output_file_path\n    CONFIG.audio.signal_norm = True\n    # remove redundant values\n    del CONFIG.audio.max_norm\n    del CONFIG.audio.min_level_db\n    del CONFIG.audio.symmetric_norm\n    del CONFIG.audio.clip_norm\n    stats[\"audio_config\"] = CONFIG.audio.to_dict()\n    np.save(output_file_path, stats, allow_pickle=True)\n    print(f\" > stats saved to {output_file_path}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/bin/eval_encoder.py",
    "content": "import argparse\nfrom argparse import RawTextHelpFormatter\n\nimport torch\nfrom tqdm import tqdm\n\nfrom TTS.config import load_config\nfrom TTS.tts.datasets import load_tts_samples\nfrom TTS.tts.utils.speakers import SpeakerManager\n\n\ndef compute_encoder_accuracy(dataset_items, encoder_manager):\n    class_name_key = encoder_manager.encoder_config.class_name_key\n    map_classid_to_classname = getattr(encoder_manager.encoder_config, \"map_classid_to_classname\", None)\n\n    class_acc_dict = {}\n\n    # compute embeddings for all wav_files\n    for item in tqdm(dataset_items):\n        class_name = item[class_name_key]\n        wav_file = item[\"audio_file\"]\n\n        # extract the embedding\n        embedd = encoder_manager.compute_embedding_from_clip(wav_file)\n        if encoder_manager.encoder_criterion is not None and map_classid_to_classname is not None:\n            embedding = torch.FloatTensor(embedd).unsqueeze(0)\n            if encoder_manager.use_cuda:\n                embedding = embedding.cuda()\n\n            class_id = encoder_manager.encoder_criterion.softmax.inference(embedding).item()\n            predicted_label = map_classid_to_classname[str(class_id)]\n        else:\n            predicted_label = None\n\n        if class_name is not None and predicted_label is not None:\n            is_equal = int(class_name == predicted_label)\n            if class_name not in class_acc_dict:\n                class_acc_dict[class_name] = [is_equal]\n            else:\n                class_acc_dict[class_name].append(is_equal)\n        else:\n            raise RuntimeError(\"Error: class_name or/and predicted_label are None\")\n\n    acc_avg = 0\n    for key, values in class_acc_dict.items():\n        acc = sum(values) / len(values)\n        print(\"Class\", key, \"Accuracy:\", acc)\n        acc_avg += acc\n\n    print(\"Average Accuracy:\", acc_avg / len(class_acc_dict))\n\n\nif __name__ == \"__main__\":\n    parser = 
argparse.ArgumentParser(\n        description=\"\"\"Compute the accuracy of the encoder.\\n\\n\"\"\"\n        \"\"\"\n        Example runs:\n        python TTS/bin/eval_encoder.py emotion_encoder_model.pth emotion_encoder_config.json  dataset_config.json\n        \"\"\",\n        formatter_class=RawTextHelpFormatter,\n    )\n    parser.add_argument(\"model_path\", type=str, help=\"Path to model checkpoint file.\")\n    parser.add_argument(\n        \"config_path\",\n        type=str,\n        help=\"Path to model config file.\",\n    )\n\n    parser.add_argument(\n        \"config_dataset_path\",\n        type=str,\n        help=\"Path to dataset config file.\",\n    )\n    parser.add_argument(\"--use_cuda\", type=bool, help=\"flag to set cuda.\", default=True)\n    parser.add_argument(\"--eval\", type=bool, help=\"compute eval.\", default=True)\n\n    args = parser.parse_args()\n\n    c_dataset = load_config(args.config_dataset_path)\n\n    meta_data_train, meta_data_eval = load_tts_samples(c_dataset.datasets, eval_split=args.eval)\n    items = meta_data_train + meta_data_eval\n\n    enc_manager = SpeakerManager(\n        encoder_model_path=args.model_path, encoder_config_path=args.config_path, use_cuda=args.use_cuda\n    )\n\n    compute_encoder_accuracy(items, enc_manager)\n"
  },
  {
    "path": "TTS/bin/extract_tts_spectrograms.py",
    "content": "#!/usr/bin/env python3\n\"\"\"Extract Mel spectrograms with teacher forcing.\"\"\"\n\nimport argparse\nimport os\n\nimport numpy as np\nimport torch\nfrom torch.utils.data import DataLoader\nfrom tqdm import tqdm\n\nfrom TTS.config import load_config\nfrom TTS.tts.datasets import TTSDataset, load_tts_samples\nfrom TTS.tts.models import setup_model\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.utils.generic_utils import count_parameters\n\nuse_cuda = torch.cuda.is_available()\n\n\ndef setup_loader(ap, r, verbose=False):\n    tokenizer, _ = TTSTokenizer.init_from_config(c)\n    dataset = TTSDataset(\n        outputs_per_step=r,\n        compute_linear_spec=False,\n        samples=meta_data,\n        tokenizer=tokenizer,\n        ap=ap,\n        batch_group_size=0,\n        min_text_len=c.min_text_len,\n        max_text_len=c.max_text_len,\n        min_audio_len=c.min_audio_len,\n        max_audio_len=c.max_audio_len,\n        phoneme_cache_path=c.phoneme_cache_path,\n        precompute_num_workers=0,\n        use_noise_augment=False,\n        verbose=verbose,\n        speaker_id_mapping=speaker_manager.name_to_id if c.use_speaker_embedding else None,\n        d_vector_mapping=speaker_manager.embeddings if c.use_d_vector_file else None,\n    )\n\n    if c.use_phonemes and c.compute_input_seq_cache:\n        # precompute phonemes to have a better estimate of sequence lengths.\n        dataset.compute_input_seq(c.num_loader_workers)\n    dataset.preprocess_samples()\n\n    loader = DataLoader(\n        dataset,\n        batch_size=c.batch_size,\n        shuffle=False,\n        collate_fn=dataset.collate_fn,\n        drop_last=False,\n        sampler=None,\n        num_workers=c.num_loader_workers,\n        pin_memory=False,\n    )\n    return loader\n\n\ndef set_filename(wav_path, out_path):\n    wav_file = os.path.basename(wav_path)\n    
file_name = wav_file.split(\".\")[0]\n    os.makedirs(os.path.join(out_path, \"quant\"), exist_ok=True)\n    os.makedirs(os.path.join(out_path, \"mel\"), exist_ok=True)\n    os.makedirs(os.path.join(out_path, \"wav_gl\"), exist_ok=True)\n    os.makedirs(os.path.join(out_path, \"wav\"), exist_ok=True)\n    wavq_path = os.path.join(out_path, \"quant\", file_name)\n    mel_path = os.path.join(out_path, \"mel\", file_name)\n    wav_gl_path = os.path.join(out_path, \"wav_gl\", file_name + \".wav\")\n    wav_path = os.path.join(out_path, \"wav\", file_name + \".wav\")\n    return file_name, wavq_path, mel_path, wav_gl_path, wav_path\n\n\ndef format_data(data):\n    # setup input data\n    text_input = data[\"token_id\"]\n    text_lengths = data[\"token_id_lengths\"]\n    mel_input = data[\"mel\"]\n    mel_lengths = data[\"mel_lengths\"]\n    item_idx = data[\"item_idxs\"]\n    d_vectors = data[\"d_vectors\"]\n    speaker_ids = data[\"speaker_ids\"]\n    attn_mask = data[\"attns\"]\n    avg_text_length = torch.mean(text_lengths.float())\n    avg_spec_length = torch.mean(mel_lengths.float())\n\n    # dispatch data to GPU\n    if use_cuda:\n        text_input = text_input.cuda(non_blocking=True)\n        text_lengths = text_lengths.cuda(non_blocking=True)\n        mel_input = mel_input.cuda(non_blocking=True)\n        mel_lengths = mel_lengths.cuda(non_blocking=True)\n        if speaker_ids is not None:\n            speaker_ids = speaker_ids.cuda(non_blocking=True)\n        if d_vectors is not None:\n            d_vectors = d_vectors.cuda(non_blocking=True)\n        if attn_mask is not None:\n            attn_mask = attn_mask.cuda(non_blocking=True)\n    return (\n        text_input,\n        text_lengths,\n        mel_input,\n        mel_lengths,\n        speaker_ids,\n        d_vectors,\n        avg_text_length,\n        avg_spec_length,\n        attn_mask,\n        item_idx,\n    )\n\n\n@torch.no_grad()\ndef inference(\n    model_name,\n    model,\n    ap,\n    
text_input,\n    text_lengths,\n    mel_input,\n    mel_lengths,\n    speaker_ids=None,\n    d_vectors=None,\n):\n    if model_name == \"glow_tts\":\n        speaker_c = None\n        if speaker_ids is not None:\n            speaker_c = speaker_ids\n        elif d_vectors is not None:\n            speaker_c = d_vectors\n        outputs = model.inference_with_MAS(\n            text_input,\n            text_lengths,\n            mel_input,\n            mel_lengths,\n            aux_input={\"d_vectors\": speaker_c, \"speaker_ids\": speaker_ids},\n        )\n        model_output = outputs[\"model_outputs\"]\n        model_output = model_output.detach().cpu().numpy()\n\n    elif \"tacotron\" in model_name:\n        aux_input = {\"speaker_ids\": speaker_ids, \"d_vectors\": d_vectors}\n        outputs = model(text_input, text_lengths, mel_input, mel_lengths, aux_input)\n        postnet_outputs = outputs[\"model_outputs\"]\n        # normalize tacotron output\n        if model_name == \"tacotron\":\n            mel_specs = []\n            postnet_outputs = postnet_outputs.data.cpu().numpy()\n            for b in range(postnet_outputs.shape[0]):\n                postnet_output = postnet_outputs[b]\n                mel_specs.append(torch.FloatTensor(ap.out_linear_to_mel(postnet_output.T).T))\n            model_output = torch.stack(mel_specs).cpu().numpy()\n\n        elif model_name == \"tacotron2\":\n            model_output = postnet_outputs.detach().cpu().numpy()\n    return model_output\n\n\ndef extract_spectrograms(\n    data_loader, model, ap, output_path, quantized_wav=False, save_audio=False, debug=False, metada_name=\"metada.txt\"\n):\n    model.eval()\n    export_metadata = []\n    for _, data in tqdm(enumerate(data_loader), total=len(data_loader)):\n        # format data\n        (\n            text_input,\n            text_lengths,\n            mel_input,\n            mel_lengths,\n            speaker_ids,\n            d_vectors,\n            _,\n            _,\n  
          _,\n            item_idx,\n        ) = format_data(data)\n\n        model_output = inference(\n            c.model.lower(),\n            model,\n            ap,\n            text_input,\n            text_lengths,\n            mel_input,\n            mel_lengths,\n            speaker_ids,\n            d_vectors,\n        )\n\n        for idx in range(text_input.shape[0]):\n            wav_file_path = item_idx[idx]\n            wav = ap.load_wav(wav_file_path)\n            _, wavq_path, mel_path, wav_gl_path, wav_path = set_filename(wav_file_path, output_path)\n\n            # quantize and save wav\n            if quantized_wav:\n                wavq = ap.quantize(wav)\n                np.save(wavq_path, wavq)\n\n            # save TTS mel\n            mel = model_output[idx]\n            mel_length = mel_lengths[idx]\n            mel = mel[:mel_length, :].T\n            np.save(mel_path, mel)\n\n            export_metadata.append([wav_file_path, mel_path])\n            if save_audio:\n                ap.save_wav(wav, wav_path)\n\n            if debug:\n                print(\"Audio for debug saved at:\", wav_gl_path)\n                wav = ap.inv_melspectrogram(mel)\n                ap.save_wav(wav, wav_gl_path)\n\n    with open(os.path.join(output_path, metada_name), \"w\", encoding=\"utf-8\") as f:\n        for data in export_metadata:\n            f.write(f\"{data[0]}|{data[1]+'.npy'}\\n\")\n\n\ndef main(args):  # pylint: disable=redefined-outer-name\n    # pylint: disable=global-variable-undefined\n    global meta_data, speaker_manager\n\n    # Audio processor\n    ap = AudioProcessor(**c.audio)\n\n    # load data instances\n    meta_data_train, meta_data_eval = load_tts_samples(\n        c.datasets, eval_split=args.eval, eval_split_max_size=c.eval_split_max_size, eval_split_size=c.eval_split_size\n    )\n\n    # use eval and training partitions\n    meta_data = meta_data_train + meta_data_eval\n\n    # init speaker manager\n    if 
c.use_speaker_embedding:\n        speaker_manager = SpeakerManager(data_items=meta_data)\n    elif c.use_d_vector_file:\n        speaker_manager = SpeakerManager(d_vectors_file_path=c.d_vector_file)\n    else:\n        speaker_manager = None\n\n    # setup model\n    model = setup_model(c)\n\n    # restore model\n    model.load_checkpoint(c, args.checkpoint_path, eval=True)\n\n    if use_cuda:\n        model.cuda()\n\n    num_params = count_parameters(model)\n    print(\"\\n > Model has {} parameters\".format(num_params), flush=True)\n    # set r\n    r = 1 if c.model.lower() == \"glow_tts\" else model.decoder.r\n    own_loader = setup_loader(ap, r, verbose=True)\n\n    extract_spectrograms(\n        own_loader,\n        model,\n        ap,\n        args.output_path,\n        quantized_wav=args.quantized,\n        save_audio=args.save_audio,\n        debug=args.debug,\n        metada_name=\"metada.txt\",\n    )\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--config_path\", type=str, help=\"Path to config file for training.\", required=True)\n    parser.add_argument(\"--checkpoint_path\", type=str, help=\"Model file to be restored.\", required=True)\n    parser.add_argument(\"--output_path\", type=str, help=\"Path to save mel specs\", required=True)\n    parser.add_argument(\"--debug\", default=False, action=\"store_true\", help=\"Save audio files for debug\")\n    parser.add_argument(\"--save_audio\", default=False, action=\"store_true\", help=\"Save audio files\")\n    parser.add_argument(\"--quantized\", action=\"store_true\", help=\"Save quantized audio files\")\n    parser.add_argument(\"--eval\", type=bool, help=\"compute eval.\", default=True)\n    args = parser.parse_args()\n\n    c = load_config(args.config_path)\n    c.audio.trim_silence = False\n    main(args)\n"
  },
  {
    "path": "TTS/bin/find_unique_chars.py",
    "content": "\"\"\"Find all the unique characters in a dataset\"\"\"\nimport argparse\nfrom argparse import RawTextHelpFormatter\n\nfrom TTS.config import load_config\nfrom TTS.tts.datasets import load_tts_samples\n\n\ndef main():\n    # pylint: disable=bad-option-value\n    parser = argparse.ArgumentParser(\n        description=\"\"\"Find all the unique characters or phonemes in a dataset.\\n\\n\"\"\"\n        \"\"\"\n    Example runs:\n\n    python TTS/bin/find_unique_chars.py --config_path config.json\n    \"\"\",\n        formatter_class=RawTextHelpFormatter,\n    )\n    parser.add_argument(\"--config_path\", type=str, help=\"Path to dataset config file.\", required=True)\n    args = parser.parse_args()\n\n    c = load_config(args.config_path)\n\n    # load all datasets\n    train_items, eval_items = load_tts_samples(\n        c.datasets, eval_split=True, eval_split_max_size=c.eval_split_max_size, eval_split_size=c.eval_split_size\n    )\n\n    items = train_items + eval_items\n\n    texts = \"\".join(item[\"text\"] for item in items)\n    chars = set(texts)\n    lower_chars = filter(lambda c: c.islower(), chars)\n    chars_force_lower = [c.lower() for c in chars]\n    chars_force_lower = set(chars_force_lower)\n\n    print(f\" > Number of unique characters: {len(chars)}\")\n    print(f\" > Unique characters: {''.join(sorted(chars))}\")\n    print(f\" > Unique lower characters: {''.join(sorted(lower_chars))}\")\n    print(f\" > Unique all forced to lower characters: {''.join(sorted(chars_force_lower))}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/bin/find_unique_phonemes.py",
    "content": "\"\"\"Find all the unique characters in a dataset\"\"\"\nimport argparse\nimport multiprocessing\nfrom argparse import RawTextHelpFormatter\n\nfrom tqdm.contrib.concurrent import process_map\n\nfrom TTS.config import load_config\nfrom TTS.tts.datasets import load_tts_samples\nfrom TTS.tts.utils.text.phonemizers import Gruut\n\n\ndef compute_phonemes(item):\n    text = item[\"text\"]\n    ph = phonemizer.phonemize(text).replace(\"|\", \"\")\n    return set(list(ph))\n\n\ndef main():\n    # pylint: disable=W0601\n    global c, phonemizer\n    # pylint: disable=bad-option-value\n    parser = argparse.ArgumentParser(\n        description=\"\"\"Find all the unique characters or phonemes in a dataset.\\n\\n\"\"\"\n        \"\"\"\n    Example runs:\n\n    python TTS/bin/find_unique_phonemes.py --config_path config.json\n    \"\"\",\n        formatter_class=RawTextHelpFormatter,\n    )\n    parser.add_argument(\"--config_path\", type=str, help=\"Path to dataset config file.\", required=True)\n    args = parser.parse_args()\n\n    c = load_config(args.config_path)\n\n    # load all datasets\n    train_items, eval_items = load_tts_samples(\n        c.datasets, eval_split=True, eval_split_max_size=c.eval_split_max_size, eval_split_size=c.eval_split_size\n    )\n    items = train_items + eval_items\n    print(\"Num items:\", len(items))\n\n    language_list = [item[\"language\"] for item in items]\n    is_lang_def = all(language_list)\n\n    if not c.phoneme_language or not is_lang_def:\n        raise ValueError(\"Phoneme language must be defined in config.\")\n\n    if not language_list.count(language_list[0]) == len(language_list):\n        raise ValueError(\n            \"Currently, just one phoneme language per config file is supported !! 
Please split the dataset config into different configs and run it individually for each language !!\"\n        )\n\n    phonemizer = Gruut(language=language_list[0], keep_puncs=True)\n\n    phonemes = process_map(compute_phonemes, items, max_workers=multiprocessing.cpu_count(), chunksize=15)\n    phones = []\n    for ph in phonemes:\n        phones.extend(ph)\n\n    phones = set(phones)\n    lower_phones = filter(lambda c: c.islower(), phones)\n    phones_force_lower = [c.lower() for c in phones]\n    phones_force_lower = set(phones_force_lower)\n\n    print(f\" > Number of unique phonemes: {len(phones)}\")\n    print(f\" > Unique phonemes: {''.join(sorted(phones))}\")\n    print(f\" > Unique lower phonemes: {''.join(sorted(lower_phones))}\")\n    print(f\" > Unique all forced to lower phonemes: {''.join(sorted(phones_force_lower))}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/bin/remove_silence_using_vad.py",
    "content": "import argparse\nimport glob\nimport os\nimport pathlib\n\nfrom tqdm import tqdm\n\nfrom TTS.utils.vad import get_vad_model_and_utils, remove_silence\n\n\ndef adjust_path_and_remove_silence(audio_path):\n    output_path = audio_path.replace(os.path.join(args.input_dir, \"\"), os.path.join(args.output_dir, \"\"))\n    # ignore if the file exists\n    if os.path.exists(output_path) and not args.force:\n        return output_path\n\n    # create all directory structure\n    pathlib.Path(output_path).parent.mkdir(parents=True, exist_ok=True)\n    # remove the silence and save the audio\n    output_path, is_speech = remove_silence(\n        model_and_utils,\n        audio_path,\n        output_path,\n        trim_just_beginning_and_end=args.trim_just_beginning_and_end,\n        use_cuda=args.use_cuda,\n    )\n\n    return output_path, is_speech\n\n\ndef preprocess_audios():\n    files = sorted(glob.glob(os.path.join(args.input_dir, args.glob), recursive=True))\n    print(\"> Number of files: \", len(files))\n    if not args.force:\n        print(\"> Ignoring files that already exist in the output idrectory.\")\n\n    if args.trim_just_beginning_and_end:\n        print(\"> Trimming just the beginning and the end with nonspeech parts.\")\n    else:\n        print(\"> Trimming all nonspeech parts.\")\n\n    filtered_files = []\n    if files:\n        # create threads\n        # num_threads = multiprocessing.cpu_count()\n        # process_map(adjust_path_and_remove_silence, files, max_workers=num_threads, chunksize=15)\n        for f in tqdm(files):\n            output_path, is_speech = adjust_path_and_remove_silence(f)\n            if not is_speech:\n                filtered_files.append(output_path)\n\n        # write files that do not have speech\n        with open(os.path.join(args.output_dir, \"filtered_files.txt\"), \"w\", encoding=\"utf-8\") as f:\n            for file in filtered_files:\n                f.write(file + \"\\n\")\n    else:\n        
 No files Found">
print(\"> No files found!\")\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser(\n        description=\"python TTS/bin/remove_silence_using_vad.py -i=VCTK-Corpus/ -o=VCTK-Corpus-removed-silence/ -g=wav48_silence_trimmed/*/*_mic1.flac --trim_just_beginning_and_end True\"\n    )\n    parser.add_argument(\"-i\", \"--input_dir\", type=str, default=\"../VCTK-Corpus\", help=\"Dataset root dir\")\n    parser.add_argument(\n        \"-o\", \"--output_dir\", type=str, default=\"../VCTK-Corpus-removed-silence\", help=\"Output Dataset dir\"\n    )\n    parser.add_argument(\"-f\", \"--force\", default=False, action=\"store_true\", help=\"Force replacement of existing files\")\n    parser.add_argument(\n        \"-g\",\n        \"--glob\",\n        type=str,\n        default=\"**/*.wav\",\n        help=\"path in glob format to access wavs from input_dir. ex: wav48/*/*.wav\",\n    )\n    parser.add_argument(\n        \"-t\",\n        \"--trim_just_beginning_and_end\",\n        type=bool,\n        default=True,\n        help=\"If True this script will trim just the beginning and end nonspeech parts. If False all nonspeech parts will be trimmed. Default True\",\n    )\n    parser.add_argument(\n        \"-c\",\n        \"--use_cuda\",\n        type=bool,\n        default=False,\n        help=\"If True use cuda\",\n    )\n    args = parser.parse_args()\n    # load the model and utils\n    model_and_utils = get_vad_model_and_utils(use_cuda=args.use_cuda)\n    preprocess_audios()\n"
  },
  {
    "path": "TTS/bin/resample.py",
    "content": "import argparse\nimport glob\nimport os\nfrom argparse import RawTextHelpFormatter\nfrom multiprocessing import Pool\nfrom shutil import copytree\n\nimport librosa\nimport soundfile as sf\nfrom tqdm import tqdm\n\n\ndef resample_file(func_args):\n    filename, output_sr = func_args\n    y, sr = librosa.load(filename, sr=output_sr)\n    sf.write(filename, y, sr)\n\n\ndef resample_files(input_dir, output_sr, output_dir=None, file_ext=\"wav\", n_jobs=10):\n    if output_dir:\n        print(\"Recursively copying the input folder...\")\n        copytree(input_dir, output_dir)\n        input_dir = output_dir\n\n    print(\"Resampling the audio files...\")\n    audio_files = glob.glob(os.path.join(input_dir, f\"**/*.{file_ext}\"), recursive=True)\n    print(f\"Found {len(audio_files)} files...\")\n    audio_files = list(zip(audio_files, len(audio_files) * [output_sr]))\n    with Pool(processes=n_jobs) as p:\n        with tqdm(total=len(audio_files)) as pbar:\n            for _, _ in enumerate(p.imap_unordered(resample_file, audio_files)):\n                pbar.update()\n\n    print(\"Done !\")\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser(\n        description=\"\"\"Resample a folder recusively with librosa\n                       Can be used in place or create a copy of the folder as an output.\\n\\n\n                       Example run:\n                            python TTS/bin/resample.py\n                                --input_dir /root/LJSpeech-1.1/\n                                --output_sr 22050\n                                --output_dir /root/resampled_LJSpeech-1.1/\n                                --file_ext wav\n                                --n_jobs 24\n                    \"\"\",\n        formatter_class=RawTextHelpFormatter,\n    )\n\n    parser.add_argument(\n        \"--input_dir\",\n        type=str,\n        default=None,\n        required=True,\n        help=\"Path of the folder containing the audio 
files to resample\",\n    )\n\n    parser.add_argument(\n        \"--output_sr\",\n        type=int,\n        default=22050,\n        required=False,\n        help=\"Sample rate to which the audio files should be resampled\",\n    )\n\n    parser.add_argument(\n        \"--output_dir\",\n        type=str,\n        default=None,\n        required=False,\n        help=\"Path of the destination folder. If not defined, the operation is done in place\",\n    )\n\n    parser.add_argument(\n        \"--file_ext\",\n        type=str,\n        default=\"wav\",\n        required=False,\n        help=\"Extension of the audio files to resample\",\n    )\n\n    parser.add_argument(\n        \"--n_jobs\", type=int, default=None, help=\"Number of worker processes to use; by default it uses all cores\"\n    )\n\n    args = parser.parse_args()\n\n    resample_files(args.input_dir, args.output_sr, args.output_dir, args.file_ext, args.n_jobs)\n"
  },
  {
    "path": "TTS/bin/synthesize.py",
    "content": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n\nimport argparse\nimport sys\nfrom argparse import RawTextHelpFormatter\n\n# pylint: disable=redefined-outer-name, unused-argument\nfrom pathlib import Path\n\nfrom TTS.utils.manage import ModelManager\nfrom TTS.utils.synthesizer import Synthesizer\n\n\ndef str2bool(v):\n    if isinstance(v, bool):\n        return v\n    if v.lower() in (\"yes\", \"true\", \"t\", \"y\", \"1\"):\n        return True\n    if v.lower() in (\"no\", \"false\", \"f\", \"n\", \"0\"):\n        return False\n    raise argparse.ArgumentTypeError(\"Boolean value expected.\")\n\n\ndef main():\n    description = \"\"\"Synthesize speech on command line.\n\nYou can either use your trained model or choose a model from the provided list.\n\nIf you don't specify any models, then it uses LJSpeech based English model.\n\n## Example Runs\n\n### Single Speaker Models\n\n- List provided models:\n\n    ```\n    $ tts --list_models\n    ```\n\n- Query info for model info by idx:\n\n    ```\n    $ tts --model_info_by_idx \"<model_type>/<model_query_idx>\"\n    ```\n\n- Query info for model info by full name:\n\n    ```\n    $ tts --model_info_by_name \"<model_type>/<language>/<dataset>/<model_name>\"\n    ```\n\n- Run TTS with default models:\n\n    ```\n    $ tts --text \"Text for TTS\"\n    ```\n\n- Run a TTS model with its default vocoder model:\n\n    ```\n    $ tts --text \"Text for TTS\" --model_name \"<model_type>/<language>/<dataset>/<model_name>\n    ```\n\n- Run with specific TTS and vocoder models from the list:\n\n    ```\n    $ tts --text \"Text for TTS\" --model_name \"<model_type>/<language>/<dataset>/<model_name>\" --vocoder_name \"<model_type>/<language>/<dataset>/<model_name>\" --output_path\n    ```\n\n- Run your own TTS model (Using Griffin-Lim Vocoder):\n\n    ```\n    $ tts --text \"Text for TTS\" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav\n    ```\n\n- Run your own 
TTS and Vocoder models:\n    ```\n    $ tts --text \"Text for TTS\" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav\n        --vocoder_path path/to/vocoder.pth --vocoder_config_path path/to/vocoder_config.json\n    ```\n\n### Multi-speaker Models\n\n- List the available speakers and choose a <speaker_id> among them:\n\n    ```\n    $ tts --model_name \"<model_type>/<language>/<dataset>/<model_name>\"  --list_speaker_idxs\n    ```\n\n- Run the multi-speaker TTS model with the target speaker ID:\n\n    ```\n    $ tts --text \"Text for TTS.\" --out_path output/path/speech.wav --model_name \"<model_type>/<language>/<dataset>/<model_name>\"  --speaker_idx <speaker_id>\n    ```\n\n- Run your own multi-speaker TTS model:\n\n    ```\n    $ tts --text \"Text for TTS\" --out_path output/path/speech.wav --model_path path/to/model.pth --config_path path/to/config.json --speakers_file_path path/to/speaker.json --speaker_idx <speaker_id>\n    ```\n\n### Voice Conversion Models\n\n    ```\n    $ tts --out_path output/path/speech.wav --model_name \"<model_type>/<language>/<dataset>/<model_name>\" --source_wav <path/to/speaker/wav> --target_wav <path/to/reference/wav>\n    ```\n    \"\"\"\n    # We remove Markdown code formatting programmatically here to allow us to copy-and-paste from main README to keep\n    # documentation in sync more easily.\n    parser = argparse.ArgumentParser(\n        description=description.replace(\"    ```\\n\", \"\"),\n        formatter_class=RawTextHelpFormatter,\n    )\n\n    parser.add_argument(\n        \"--list_models\",\n        type=str2bool,\n        nargs=\"?\",\n        const=True,\n        default=False,\n        help=\"list available pre-trained TTS and vocoder models.\",\n    )\n\n    parser.add_argument(\n        \"--model_info_by_idx\",\n        type=str,\n        default=None,\n        help=\"model info using query format: <model_type>/<model_query_idx>\",\n    )\n\n    parser.add_argument(\n        
\"--model_info_by_name\",\n        type=str,\n        default=None,\n        help=\"model info using query format: <model_type>/<language>/<dataset>/<model_name>\",\n    )\n\n    parser.add_argument(\"--text\", type=str, default=None, help=\"Text to generate speech.\")\n\n    # Args for running pre-trained TTS models.\n    parser.add_argument(\n        \"--model_name\",\n        type=str,\n        default=\"tts_models/en/ljspeech/tacotron2-DDC\",\n        help=\"Name of one of the pre-trained TTS models in format <language>/<dataset>/<model_name>\",\n    )\n    parser.add_argument(\n        \"--vocoder_name\",\n        type=str,\n        default=None,\n        help=\"Name of one of the pre-trained  vocoder models in format <language>/<dataset>/<model_name>\",\n    )\n\n    # Args for running custom models\n    parser.add_argument(\"--config_path\", default=None, type=str, help=\"Path to model config file.\")\n    parser.add_argument(\n        \"--model_path\",\n        type=str,\n        default=None,\n        help=\"Path to model file.\",\n    )\n    parser.add_argument(\n        \"--out_path\",\n        type=str,\n        default=\"tts_output.wav\",\n        help=\"Output wav file path.\",\n    )\n    parser.add_argument(\"--use_cuda\", type=bool, help=\"Run model on CUDA.\", default=False)\n    parser.add_argument(\n        \"--vocoder_path\",\n        type=str,\n        help=\"Path to vocoder model file. If it is not defined, model uses GL as vocoder. 
Please make sure that you installed vocoder library before (WaveRNN).\",\n        default=None,\n    )\n    parser.add_argument(\"--vocoder_config_path\", type=str, help=\"Path to vocoder model config file.\", default=None)\n    parser.add_argument(\n        \"--encoder_path\",\n        type=str,\n        help=\"Path to speaker encoder model file.\",\n        default=None,\n    )\n    parser.add_argument(\"--encoder_config_path\", type=str, help=\"Path to speaker encoder config file.\", default=None)\n\n    # args for multi-speaker synthesis\n    parser.add_argument(\"--speakers_file_path\", type=str, help=\"JSON file for multi-speaker model.\", default=None)\n    parser.add_argument(\"--language_ids_file_path\", type=str, help=\"JSON file for multi-lingual model.\", default=None)\n    parser.add_argument(\n        \"--speaker_idx\",\n        type=str,\n        help=\"Target speaker ID for a multi-speaker TTS model.\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--language_idx\",\n        type=str,\n        help=\"Target language ID for a multi-lingual TTS model.\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--speaker_wav\",\n        nargs=\"+\",\n        help=\"wav file(s) to condition a multi-speaker TTS model with a Speaker Encoder. You can give multiple file paths. 
The d_vector is computed as their average.\",\n        default=None,\n    )\n    parser.add_argument(\"--gst_style\", help=\"Wav path file for GST style reference.\", default=None)\n    parser.add_argument(\n        \"--capacitron_style_wav\", type=str, help=\"Wav path file for Capacitron prosody reference.\", default=None\n    )\n    parser.add_argument(\"--capacitron_style_text\", type=str, help=\"Transcription of the reference.\", default=None)\n    parser.add_argument(\n        \"--list_speaker_idxs\",\n        help=\"List available speaker ids for the defined multi-speaker model.\",\n        type=str2bool,\n        nargs=\"?\",\n        const=True,\n        default=False,\n    )\n    parser.add_argument(\n        \"--list_language_idxs\",\n        help=\"List available language ids for the defined multi-lingual model.\",\n        type=str2bool,\n        nargs=\"?\",\n        const=True,\n        default=False,\n    )\n    # aux args\n    parser.add_argument(\n        \"--save_spectogram\",\n        type=str2bool,\n        help=\"If true save raw spectrogram for further (vocoder) processing in out_path.\",\n        default=False,\n    )\n    parser.add_argument(\n        \"--reference_wav\",\n        type=str,\n        help=\"Reference wav file to convert in the voice of the speaker_idx or speaker_wav\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--reference_speaker_idx\",\n        type=str,\n        help=\"speaker ID of the reference_wav speaker (If not provided the embedding will be computed using the Speaker Encoder).\",\n        default=None,\n    )\n    parser.add_argument(\n        \"--progress_bar\",\n        type=str2bool,\n        help=\"If true shows a progress bar for the model download. 
Defaults to True\",\n        default=True,\n    )\n\n    # voice conversion args\n    parser.add_argument(\n        \"--source_wav\",\n        type=str,\n        default=None,\n        help=\"Original audio file to convert in the voice of the target_wav\",\n    )\n    parser.add_argument(\n        \"--target_wav\",\n        type=str,\n        default=None,\n        help=\"Target audio file to convert in the voice of the source_wav\",\n    )\n\n    args = parser.parse_args()\n\n    # print the description if either text or list_models is not set\n    check_args = [\n        args.text,\n        args.list_models,\n        args.list_speaker_idxs,\n        args.list_language_idxs,\n        args.reference_wav,\n        args.model_info_by_idx,\n        args.model_info_by_name,\n        args.source_wav,\n        args.target_wav,\n    ]\n    if not any(check_args):\n        parser.parse_args([\"-h\"])\n\n    # load model manager\n    path = Path(__file__).parent / \"../.models.json\"\n    manager = ModelManager(path, progress_bar=args.progress_bar)\n\n    tts_path = None\n    tts_config_path = None\n    speakers_file_path = None\n    language_ids_file_path = None\n    vocoder_path = None\n    vocoder_config_path = None\n    encoder_path = None\n    encoder_config_path = None\n    vc_path = None\n    vc_config_path = None\n\n    # CASE1 #list : list pre-trained TTS models\n    if args.list_models:\n        manager.list_models()\n        sys.exit()\n\n    # CASE2 #info : model info for pre-trained TTS models\n    if args.model_info_by_idx:\n        model_query = args.model_info_by_idx\n        manager.model_info_by_idx(model_query)\n        sys.exit()\n\n    if args.model_info_by_name:\n        model_query_full_name = args.model_info_by_name\n        manager.model_info_by_full_name(model_query_full_name)\n        sys.exit()\n\n    # CASE3: load pre-trained model paths\n    if args.model_name is not None and not args.model_path:\n        model_path, config_path, model_item = 
manager.download_model(args.model_name)\n\n        # tts model\n        if model_item[\"model_type\"] == \"tts_models\":\n            tts_path = model_path\n            tts_config_path = config_path\n            if \"default_vocoder\" in model_item:\n                args.vocoder_name = model_item[\"default_vocoder\"] if args.vocoder_name is None else args.vocoder_name\n\n        # voice conversion model\n        if model_item[\"model_type\"] == \"voice_conversion_models\":\n            vc_path = model_path\n            vc_config_path = config_path\n\n    # load vocoder\n    if args.vocoder_name is not None and not args.vocoder_path:\n        vocoder_path, vocoder_config_path, _ = manager.download_model(args.vocoder_name)\n\n    # CASE4: set custom model paths\n    if args.model_path is not None:\n        tts_path = args.model_path\n        tts_config_path = args.config_path\n        speakers_file_path = args.speakers_file_path\n        language_ids_file_path = args.language_ids_file_path\n\n    if args.vocoder_path is not None:\n        vocoder_path = args.vocoder_path\n        vocoder_config_path = args.vocoder_config_path\n\n    if args.encoder_path is not None:\n        encoder_path = args.encoder_path\n        encoder_config_path = args.encoder_config_path\n\n    # load models\n    synthesizer = Synthesizer(\n        tts_path,\n        tts_config_path,\n        speakers_file_path,\n        language_ids_file_path,\n        vocoder_path,\n        vocoder_config_path,\n        encoder_path,\n        encoder_config_path,\n        vc_path,\n        vc_config_path,\n        args.use_cuda,\n    )\n\n    # query speaker ids of a multi-speaker model.\n    if args.list_speaker_idxs:\n        print(\n            \" > Available speaker ids: (Set --speaker_idx flag to one of these values to use the multi-speaker model.\"\n        )\n        print(synthesizer.tts_model.speaker_manager.name_to_id)\n        return\n\n    # query langauge ids of a multi-lingual model.\n    if 
args.list_language_idxs:\n        print(\n            \" > Available language ids: (Set --language_idx flag to one of these values to use the multi-lingual model.\"\n        )\n        print(synthesizer.tts_model.language_manager.name_to_id)\n        return\n\n    # check the arguments against a multi-speaker model.\n    if synthesizer.tts_speakers_file and (not args.speaker_idx and not args.speaker_wav):\n        print(\n            \" [!] Looks like you use a multi-speaker model. Define `--speaker_idx` to \"\n            \"select the target speaker. You can list the available speakers for this model by `--list_speaker_idxs`.\"\n        )\n        return\n\n    # RUN THE SYNTHESIS\n    if args.text:\n        print(\" > Text: {}\".format(args.text))\n\n    # kick it\n    if tts_path is not None:\n        wav = synthesizer.tts(\n            args.text,\n            args.speaker_idx,\n            args.language_idx,\n            args.speaker_wav,\n            reference_wav=args.reference_wav,\n            style_wav=args.capacitron_style_wav,\n            style_text=args.capacitron_style_text,\n            reference_speaker_name=args.reference_speaker_idx,\n        )\n    elif vc_path is not None:\n        wav = synthesizer.voice_conversion(\n            source_wav=args.source_wav,\n            target_wav=args.target_wav,\n        )\n\n    # save the results\n    print(\" > Saving output to {}\".format(args.out_path))\n    synthesizer.save_wav(wav, args.out_path)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/bin/train_encoder.py",
    "content": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n\nimport os\nimport sys\nimport time\nimport traceback\n\nimport torch\nfrom torch.utils.data import DataLoader\nfrom trainer.torch import NoamLR\nfrom trainer.trainer_utils import get_optimizer\n\nfrom TTS.encoder.dataset import EncoderDataset\nfrom TTS.encoder.utils.generic_utils import save_best_model, save_checkpoint, setup_encoder_model\nfrom TTS.encoder.utils.training import init_training\nfrom TTS.encoder.utils.visual import plot_embeddings\nfrom TTS.tts.datasets import load_tts_samples\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.utils.generic_utils import count_parameters, remove_experiment_folder\nfrom TTS.utils.io import copy_model_files\nfrom TTS.utils.samplers import PerfectBatchSampler\nfrom TTS.utils.training import check_update\n\ntorch.backends.cudnn.enabled = True\ntorch.backends.cudnn.benchmark = True\ntorch.manual_seed(54321)\nuse_cuda = torch.cuda.is_available()\nnum_gpus = torch.cuda.device_count()\nprint(\" > Using CUDA: \", use_cuda)\nprint(\" > Number of GPUs: \", num_gpus)\n\n\ndef setup_loader(ap: AudioProcessor, is_val: bool = False, verbose: bool = False):\n    num_utter_per_class = c.num_utter_per_class if not is_val else c.eval_num_utter_per_class\n    num_classes_in_batch = c.num_classes_in_batch if not is_val else c.eval_num_classes_in_batch\n\n    dataset = EncoderDataset(\n        c,\n        ap,\n        meta_data_eval if is_val else meta_data_train,\n        voice_len=c.voice_len,\n        num_utter_per_class=num_utter_per_class,\n        num_classes_in_batch=num_classes_in_batch,\n        verbose=verbose,\n        augmentation_config=c.audio_augmentation if not is_val else None,\n        use_torch_spec=c.model_params.get(\"use_torch_spec\", False),\n    )\n    # get classes list\n    classes = dataset.get_class_list()\n\n    sampler = PerfectBatchSampler(\n        dataset.items,\n        classes,\n        batch_size=num_classes_in_batch * 
num_utter_per_class,  # total batch size\n        num_classes_in_batch=num_classes_in_batch,\n        num_gpus=1,\n        shuffle=not is_val,\n        drop_last=True,\n    )\n\n    if len(classes) < num_classes_in_batch:\n        if is_val:\n            raise RuntimeError(\n                f\"config.eval_num_classes_in_batch ({num_classes_in_batch}) need to be <= {len(classes)} (Number total of Classes in the Eval dataset) !\"\n            )\n        raise RuntimeError(\n            f\"config.num_classes_in_batch ({num_classes_in_batch}) need to be <= {len(classes)} (Number total of Classes in the Train dataset) !\"\n        )\n\n    # set the classes to avoid get wrong class_id when the number of training and eval classes are not equal\n    if is_val:\n        dataset.set_classes(train_classes)\n\n    loader = DataLoader(\n        dataset,\n        num_workers=c.num_loader_workers,\n        batch_sampler=sampler,\n        collate_fn=dataset.collate_fn,\n    )\n\n    return loader, classes, dataset.get_map_classid_to_classname()\n\n\ndef evaluation(model, criterion, data_loader, global_step):\n    eval_loss = 0\n    for _, data in enumerate(data_loader):\n        with torch.no_grad():\n            # setup input data\n            inputs, labels = data\n\n            # agroup samples of each class in the batch. 
perfect sampler produces [3,2,1,3,2,1] we need [3,3,2,2,1,1]\n            labels = torch.transpose(\n                labels.view(c.eval_num_utter_per_class, c.eval_num_classes_in_batch), 0, 1\n            ).reshape(labels.shape)\n            inputs = torch.transpose(\n                inputs.view(c.eval_num_utter_per_class, c.eval_num_classes_in_batch, -1), 0, 1\n            ).reshape(inputs.shape)\n\n            # dispatch data to GPU\n            if use_cuda:\n                inputs = inputs.cuda(non_blocking=True)\n                labels = labels.cuda(non_blocking=True)\n\n            # forward pass model\n            outputs = model(inputs)\n\n            # loss computation\n            loss = criterion(\n                outputs.view(c.eval_num_classes_in_batch, outputs.shape[0] // c.eval_num_classes_in_batch, -1), labels\n            )\n\n            eval_loss += loss.item()\n\n    eval_avg_loss = eval_loss / len(data_loader)\n    # save stats\n    dashboard_logger.eval_stats(global_step, {\"loss\": eval_avg_loss})\n    # plot the last batch in the evaluation\n    figures = {\n        \"UMAP Plot\": plot_embeddings(outputs.detach().cpu().numpy(), c.num_classes_in_batch),\n    }\n    dashboard_logger.eval_figures(global_step, figures)\n    return eval_avg_loss\n\n\ndef train(model, optimizer, scheduler, criterion, data_loader, eval_data_loader, global_step):\n    model.train()\n    best_loss = float(\"inf\")\n    avg_loader_time = 0\n    end_time = time.time()\n    for epoch in range(c.epochs):\n        tot_loss = 0\n        epoch_time = 0\n        for _, data in enumerate(data_loader):\n            start_time = time.time()\n\n            # setup input data\n            inputs, labels = data\n            # agroup samples of each class in the batch. 
perfect sampler produces [3,2,1,3,2,1] we need [3,3,2,2,1,1]\n            labels = torch.transpose(labels.view(c.num_utter_per_class, c.num_classes_in_batch), 0, 1).reshape(\n                labels.shape\n            )\n            inputs = torch.transpose(inputs.view(c.num_utter_per_class, c.num_classes_in_batch, -1), 0, 1).reshape(\n                inputs.shape\n            )\n            # ToDo: move it to a unit test\n            # labels_converted = torch.transpose(labels.view(c.num_utter_per_class, c.num_classes_in_batch), 0, 1).reshape(labels.shape)\n            # inputs_converted = torch.transpose(inputs.view(c.num_utter_per_class, c.num_classes_in_batch, -1), 0, 1).reshape(inputs.shape)\n            # idx = 0\n            # for j in range(0, c.num_classes_in_batch, 1):\n            #     for i in range(j, len(labels), c.num_classes_in_batch):\n            #         if not torch.all(labels[i].eq(labels_converted[idx])) or not torch.all(inputs[i].eq(inputs_converted[idx])):\n            #             print(\"Invalid\")\n            #             print(labels)\n            #             exit()\n            #         idx += 1\n            # labels = labels_converted\n            # inputs = inputs_converted\n\n            loader_time = time.time() - end_time\n            global_step += 1\n\n            # setup lr\n            if c.lr_decay:\n                scheduler.step()\n            optimizer.zero_grad()\n\n            # dispatch data to GPU\n            if use_cuda:\n                inputs = inputs.cuda(non_blocking=True)\n                labels = labels.cuda(non_blocking=True)\n\n            # forward pass model\n            outputs = model(inputs)\n\n            # loss computation\n            loss = criterion(\n                outputs.view(c.num_classes_in_batch, outputs.shape[0] // c.num_classes_in_batch, -1), labels\n            )\n            loss.backward()\n            grad_norm, _ = check_update(model, c.grad_clip)\n            
optimizer.step()\n\n            step_time = time.time() - start_time\n            epoch_time += step_time\n\n            # acumulate the total epoch loss\n            tot_loss += loss.item()\n\n            # Averaged Loader Time\n            num_loader_workers = c.num_loader_workers if c.num_loader_workers > 0 else 1\n            avg_loader_time = (\n                1 / num_loader_workers * loader_time + (num_loader_workers - 1) / num_loader_workers * avg_loader_time\n                if avg_loader_time != 0\n                else loader_time\n            )\n            current_lr = optimizer.param_groups[0][\"lr\"]\n\n            if global_step % c.steps_plot_stats == 0:\n                # Plot Training Epoch Stats\n                train_stats = {\n                    \"loss\": loss.item(),\n                    \"lr\": current_lr,\n                    \"grad_norm\": grad_norm,\n                    \"step_time\": step_time,\n                    \"avg_loader_time\": avg_loader_time,\n                }\n                dashboard_logger.train_epoch_stats(global_step, train_stats)\n                figures = {\n                    \"UMAP Plot\": plot_embeddings(outputs.detach().cpu().numpy(), c.num_classes_in_batch),\n                }\n                dashboard_logger.train_figures(global_step, figures)\n\n            if global_step % c.print_step == 0:\n                print(\n                    \"   | > Step:{}  Loss:{:.5f}  GradNorm:{:.5f}  \"\n                    \"StepTime:{:.2f}  LoaderTime:{:.2f}  AvGLoaderTime:{:.2f}  LR:{:.6f}\".format(\n                        global_step, loss.item(), grad_norm, step_time, loader_time, avg_loader_time, current_lr\n                    ),\n                    flush=True,\n                )\n\n            if global_step % c.save_step == 0:\n                # save model\n                save_checkpoint(model, optimizer, criterion, loss.item(), OUT_PATH, global_step, epoch)\n\n            end_time = time.time()\n\n        
print(\"\")\n        print(\n            \">>> Epoch:{}  AvgLoss: {:.5f} GradNorm:{:.5f}  \"\n            \"EpochTime:{:.2f} AvGLoaderTime:{:.2f} \".format(\n                epoch, tot_loss / len(data_loader), grad_norm, epoch_time, avg_loader_time\n            ),\n            flush=True,\n        )\n        # evaluation\n        if c.run_eval:\n            model.eval()\n            eval_loss = evaluation(model, criterion, eval_data_loader, global_step)\n            print(\"\\n\\n\")\n            print(\"--> EVAL PERFORMANCE\")\n            print(\n                \"   | > Epoch:{}  AvgLoss: {:.5f} \".format(epoch, eval_loss),\n                flush=True,\n            )\n            # save the best checkpoint\n            best_loss = save_best_model(model, optimizer, criterion, eval_loss, best_loss, OUT_PATH, global_step, epoch)\n            model.train()\n\n    return best_loss, global_step\n\n\ndef main(args):  # pylint: disable=redefined-outer-name\n    # pylint: disable=global-variable-undefined\n    global meta_data_train\n    global meta_data_eval\n    global train_classes\n\n    ap = AudioProcessor(**c.audio)\n    model = setup_encoder_model(c)\n\n    optimizer = get_optimizer(c.optimizer, c.optimizer_params, c.lr, model)\n\n    # pylint: disable=redefined-outer-name\n    meta_data_train, meta_data_eval = load_tts_samples(c.datasets, eval_split=True)\n\n    train_data_loader, train_classes, map_classid_to_classname = setup_loader(ap, is_val=False, verbose=True)\n    if c.run_eval:\n        eval_data_loader, _, _ = setup_loader(ap, is_val=True, verbose=True)\n    else:\n        eval_data_loader = None\n\n    num_classes = len(train_classes)\n    criterion = model.get_criterion(c, num_classes)\n\n    if c.loss == \"softmaxproto\" and c.model != \"speaker_encoder\":\n        c.map_classid_to_classname = map_classid_to_classname\n        copy_model_files(c, OUT_PATH)\n\n    if args.restore_path:\n        criterion, args.restore_step = model.load_checkpoint(\n    
        c, args.restore_path, eval=False, use_cuda=use_cuda, criterion=criterion\n        )\n        print(\" > Model restored from step %d\" % args.restore_step, flush=True)\n    else:\n        args.restore_step = 0\n\n    if c.lr_decay:\n        scheduler = NoamLR(optimizer, warmup_steps=c.warmup_steps, last_epoch=args.restore_step - 1)\n    else:\n        scheduler = None\n\n    num_params = count_parameters(model)\n    print(\"\\n > Model has {} parameters\".format(num_params), flush=True)\n\n    if use_cuda:\n        model = model.cuda()\n        criterion.cuda()\n\n    global_step = args.restore_step\n    _, global_step = train(model, optimizer, scheduler, criterion, train_data_loader, eval_data_loader, global_step)\n\n\nif __name__ == \"__main__\":\n    args, c, OUT_PATH, AUDIO_PATH, c_logger, dashboard_logger = init_training()\n\n    try:\n        main(args)\n    except KeyboardInterrupt:\n        remove_experiment_folder(OUT_PATH)\n        try:\n            sys.exit(0)\n        except SystemExit:\n            os._exit(0)  # pylint: disable=protected-access\n    except Exception:  # pylint: disable=broad-except\n        remove_experiment_folder(OUT_PATH)\n        traceback.print_exc()\n        sys.exit(1)\n"
  },
  {
    "path": "TTS/bin/train_tts.py",
    "content": "import os\nfrom dataclasses import dataclass, field\n\nfrom trainer import Trainer, TrainerArgs\n\nfrom TTS.config import load_config, register_config\nfrom TTS.tts.datasets import load_tts_samples\nfrom TTS.tts.models import setup_model\n\n\n@dataclass\nclass TrainTTSArgs(TrainerArgs):\n    config_path: str = field(default=None, metadata={\"help\": \"Path to the config file.\"})\n\n\ndef main():\n    \"\"\"Run `tts` model training directly by a `config.json` file.\"\"\"\n    # init trainer args\n    train_args = TrainTTSArgs()\n    parser = train_args.init_argparse(arg_prefix=\"\")\n\n    # override trainer args from comman-line args\n    args, config_overrides = parser.parse_known_args()\n    train_args.parse_args(args)\n\n    # load config.json and register\n    if args.config_path or args.continue_path:\n        if args.config_path:\n            # init from a file\n            config = load_config(args.config_path)\n            if len(config_overrides) > 0:\n                config.parse_known_args(config_overrides, relaxed_parser=True)\n        elif args.continue_path:\n            # continue from a prev experiment\n            config = load_config(os.path.join(args.continue_path, \"config.json\"))\n            if len(config_overrides) > 0:\n                config.parse_known_args(config_overrides, relaxed_parser=True)\n        else:\n            # init from console args\n            from TTS.config.shared_configs import BaseTrainingConfig  # pylint: disable=import-outside-toplevel\n\n            config_base = BaseTrainingConfig()\n            config_base.parse_known_args(config_overrides)\n            config = register_config(config_base.model)()\n\n    # load training samples\n    train_samples, eval_samples = load_tts_samples(\n        config.datasets,\n        eval_split=True,\n        eval_split_max_size=config.eval_split_max_size,\n        eval_split_size=config.eval_split_size,\n    )\n\n    # init the model from config\n    model = 
setup_model(config, train_samples + eval_samples)\n\n    # init the trainer and 🚀\n    trainer = Trainer(\n        train_args,\n        model.config,\n        config.output_path,\n        model=model,\n        train_samples=train_samples,\n        eval_samples=eval_samples,\n        parse_command_line_args=False,\n    )\n    trainer.fit()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/bin/train_vocoder.py",
    "content": "import os\nfrom dataclasses import dataclass, field\n\nfrom trainer import Trainer, TrainerArgs\n\nfrom TTS.config import load_config, register_config\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.vocoder.datasets.preprocess import load_wav_data, load_wav_feat_data\nfrom TTS.vocoder.models import setup_model\n\n\n@dataclass\nclass TrainVocoderArgs(TrainerArgs):\n    config_path: str = field(default=None, metadata={\"help\": \"Path to the config file.\"})\n\n\ndef main():\n    \"\"\"Run `tts` model training directly by a `config.json` file.\"\"\"\n    # init trainer args\n    train_args = TrainVocoderArgs()\n    parser = train_args.init_argparse(arg_prefix=\"\")\n\n    # override trainer args from comman-line args\n    args, config_overrides = parser.parse_known_args()\n    train_args.parse_args(args)\n\n    # load config.json and register\n    if args.config_path or args.continue_path:\n        if args.config_path:\n            # init from a file\n            config = load_config(args.config_path)\n            if len(config_overrides) > 0:\n                config.parse_known_args(config_overrides, relaxed_parser=True)\n        elif args.continue_path:\n            # continue from a prev experiment\n            config = load_config(os.path.join(args.continue_path, \"config.json\"))\n            if len(config_overrides) > 0:\n                config.parse_known_args(config_overrides, relaxed_parser=True)\n        else:\n            # init from console args\n            from TTS.config.shared_configs import BaseTrainingConfig  # pylint: disable=import-outside-toplevel\n\n            config_base = BaseTrainingConfig()\n            config_base.parse_known_args(config_overrides)\n            config = register_config(config_base.model)()\n\n    # load training samples\n    if \"feature_path\" in config and config.feature_path:\n        # load pre-computed features\n        print(f\" > Loading features from: {config.feature_path}\")\n        
eval_samples, train_samples = load_wav_feat_data(config.data_path, config.feature_path, config.eval_split_size)\n    else:\n        # load data raw wav files\n        eval_samples, train_samples = load_wav_data(config.data_path, config.eval_split_size)\n\n    # setup audio processor\n    ap = AudioProcessor(**config.audio)\n\n    # init the model from config\n    model = setup_model(config)\n\n    # init the trainer and 🚀\n    trainer = Trainer(\n        train_args,\n        config,\n        config.output_path,\n        model=model,\n        train_samples=train_samples,\n        eval_samples=eval_samples,\n        training_assets={\"audio_processor\": ap},\n        parse_command_line_args=False,\n    )\n    trainer.fit()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/bin/tune_wavegrad.py",
    "content": "\"\"\"Search a good noise schedule for WaveGrad for a given number of inference iterations\"\"\"\nimport argparse\nfrom itertools import product as cartesian_product\n\nimport numpy as np\nimport torch\nfrom torch.utils.data import DataLoader\nfrom tqdm import tqdm\n\nfrom TTS.config import load_config\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.vocoder.datasets.preprocess import load_wav_data\nfrom TTS.vocoder.datasets.wavegrad_dataset import WaveGradDataset\nfrom TTS.vocoder.models import setup_model\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--model_path\", type=str, help=\"Path to model checkpoint.\")\n    parser.add_argument(\"--config_path\", type=str, help=\"Path to model config file.\")\n    parser.add_argument(\"--data_path\", type=str, help=\"Path to data directory.\")\n    parser.add_argument(\"--output_path\", type=str, help=\"path for output file including file name and extension.\")\n    parser.add_argument(\n        \"--num_iter\",\n        type=int,\n        help=\"Number of model inference iterations that you like to optimize noise schedule for.\",\n    )\n    parser.add_argument(\"--use_cuda\", action=\"store_true\", help=\"enable CUDA.\")\n    parser.add_argument(\"--num_samples\", type=int, default=1, help=\"Number of datasamples used for inference.\")\n    parser.add_argument(\n        \"--search_depth\",\n        type=int,\n        default=3,\n        help=\"Search granularity. 
Increasing this increases the run-time exponentially.\",\n    )\n\n    # load config\n    args = parser.parse_args()\n    config = load_config(args.config_path)\n\n    # setup audio processor\n    ap = AudioProcessor(**config.audio)\n\n    # load dataset\n    _, train_data = load_wav_data(args.data_path, 0)\n    train_data = train_data[: args.num_samples]\n    dataset = WaveGradDataset(\n        ap=ap,\n        items=train_data,\n        seq_len=-1,\n        hop_len=ap.hop_length,\n        pad_short=config.pad_short,\n        conv_pad=config.conv_pad,\n        is_training=True,\n        return_segments=False,\n        use_noise_augment=False,\n        use_cache=False,\n        verbose=True,\n    )\n    loader = DataLoader(\n        dataset,\n        batch_size=1,\n        shuffle=False,\n        collate_fn=dataset.collate_full_clips,\n        drop_last=False,\n        num_workers=config.num_loader_workers,\n        pin_memory=False,\n    )\n\n    # setup the model\n    model = setup_model(config)\n    if args.use_cuda:\n        model.cuda()\n\n    # setup optimization parameters\n    base_values = sorted(10 * np.random.uniform(size=args.search_depth))\n    print(f\" > base values: {base_values}\")\n    exponents = 10 ** np.linspace(-6, -1, num=args.num_iter)\n    best_error = float(\"inf\")\n    best_schedule = None  # pylint: disable=C0103\n    total_search_iter = len(base_values) ** args.num_iter\n    for base in tqdm(cartesian_product(base_values, repeat=args.num_iter), total=total_search_iter):\n        beta = exponents * base\n        model.compute_noise_level(beta)\n        for data in loader:\n            mel, audio = data\n            y_hat = model.inference(mel.cuda() if args.use_cuda else mel)\n\n            if args.use_cuda:\n                y_hat = y_hat.cpu()\n            y_hat = y_hat.numpy()\n\n            mel_hat = []\n            for i in range(y_hat.shape[0]):\n                m = ap.melspectrogram(y_hat[i, 0])[:, :-1]\n                
mel_hat.append(torch.from_numpy(m))\n\n            mel_hat = torch.stack(mel_hat)\n            mse = torch.sum((mel - mel_hat) ** 2).mean()\n            if mse.item() < best_error:\n                best_error = mse.item()\n                best_schedule = {\"beta\": beta}\n                print(f\" > Found a better schedule. - MSE: {mse.item()}\")\n                np.save(args.output_path, best_schedule)\n"
  },
  {
    "path": "TTS/config/__init__.py",
    "content": "import json\nimport os\nimport re\nfrom typing import Dict\n\nimport fsspec\nimport yaml\nfrom coqpit import Coqpit\n\nfrom TTS.config.shared_configs import *\nfrom TTS.utils.generic_utils import find_module\n\n\ndef read_json_with_comments(json_path):\n    \"\"\"for backward compat.\"\"\"\n    # fallback to json\n    with fsspec.open(json_path, \"r\", encoding=\"utf-8\") as f:\n        input_str = f.read()\n    # handle comments\n    input_str = re.sub(r\"\\\\\\n\", \"\", input_str)\n    input_str = re.sub(r\"//.*\\n\", \"\\n\", input_str)\n    data = json.loads(input_str)\n    return data\n\n\ndef register_config(model_name: str) -> Coqpit:\n    \"\"\"Find the right config for the given model name.\n\n    Args:\n        model_name (str): Model name.\n\n    Raises:\n        ModuleNotFoundError: No matching config for the model name.\n\n    Returns:\n        Coqpit: config class.\n    \"\"\"\n    config_class = None\n    config_name = model_name + \"_config\"\n    paths = [\"TTS.tts.configs\", \"TTS.vocoder.configs\", \"TTS.encoder.configs\", \"TTS.vc.configs\"]\n    for path in paths:\n        try:\n            config_class = find_module(path, config_name)\n        except ModuleNotFoundError:\n            pass\n    if config_class is None:\n        raise ModuleNotFoundError(f\" [!] Config for {model_name} cannot be found.\")\n    return config_class\n\n\ndef _process_model_name(config_dict: Dict) -> str:\n    \"\"\"Format the model name as expected. 
It is a band-aid for the old `vocoder` model names.\n\n    Args:\n        config_dict (Dict): A dictionary including the config fields.\n\n    Returns:\n        str: Formatted modelname.\n    \"\"\"\n    model_name = config_dict[\"model\"] if \"model\" in config_dict else config_dict[\"generator_model\"]\n    model_name = model_name.replace(\"_generator\", \"\").replace(\"_discriminator\", \"\")\n    return model_name\n\n\ndef load_config(config_path: str) -> Coqpit:\n    \"\"\"Import `json` or `yaml` files as TTS configs. First, load the input file as a `dict` and check the model name\n    to find the corresponding Config class. Then initialize the Config.\n\n    Args:\n        config_path (str): path to the config file.\n\n    Raises:\n        TypeError: given config file has an unknown type.\n\n    Returns:\n        Coqpit: TTS config object.\n    \"\"\"\n    config_dict = {}\n    ext = os.path.splitext(config_path)[1]\n    if ext in (\".yml\", \".yaml\"):\n        with fsspec.open(config_path, \"r\", encoding=\"utf-8\") as f:\n            data = yaml.safe_load(f)\n    elif ext == \".json\":\n        try:\n            with fsspec.open(config_path, \"r\", encoding=\"utf-8\") as f:\n                data = json.load(f)\n        except json.decoder.JSONDecodeError:\n            # backwards compat.\n            data = read_json_with_comments(config_path)\n    else:\n        raise TypeError(f\" [!] 
Unknown config file type {ext}\")\n    config_dict.update(data)\n    model_name = _process_model_name(config_dict)\n    config_class = register_config(model_name.lower())\n    config = config_class()\n    config.from_dict(config_dict)\n    return config\n\n\ndef check_config_and_model_args(config, arg_name, value):\n    \"\"\"Check whether the given argument equals the given value, looking it up in\n    `config.model_args` if it exists there and otherwise in `config`.\n\n    Return False if the argument does not exist in `config.model_args` or `config`.\n    This is to patch up the compatibility between models with and without `model_args`.\n\n    TODO: Remove this in the future with a unified approach.\n    \"\"\"\n    if hasattr(config, \"model_args\"):\n        if arg_name in config.model_args:\n            return config.model_args[arg_name] == value\n    if hasattr(config, arg_name):\n        return config[arg_name] == value\n    return False\n\n\ndef get_from_config_or_model_args(config, arg_name):\n    \"\"\"Get the given argument from `config.model_args` if it exists there, otherwise from `config`.\"\"\"\n    if hasattr(config, \"model_args\"):\n        if arg_name in config.model_args:\n            return config.model_args[arg_name]\n    return config[arg_name]\n\n\ndef get_from_config_or_model_args_with_default(config, arg_name, def_val):\n    \"\"\"Get the given argument from `config.model_args` if it exists there, otherwise from `config`.\n    Return `def_val` if it is found in neither.\"\"\"\n    if hasattr(config, \"model_args\"):\n        if arg_name in config.model_args:\n            return config.model_args[arg_name]\n    if hasattr(config, arg_name):\n        return config[arg_name]\n    return def_val\n"
  },
  {
    "path": "TTS/config/shared_configs.py",
    "content": "from dataclasses import asdict, dataclass\nfrom typing import List\n\nfrom coqpit import Coqpit, check_argument\nfrom trainer import TrainerConfig\n\n\n@dataclass\nclass BaseAudioConfig(Coqpit):\n    \"\"\"Base config to definge audio processing parameters. It is used to initialize\n    ```TTS.utils.audio.AudioProcessor.```\n\n    Args:\n        fft_size (int):\n            Number of STFT frequency levels aka.size of the linear spectogram frame. Defaults to 1024.\n\n        win_length (int):\n            Each frame of audio is windowed by window of length ```win_length``` and then padded with zeros to match\n            ```fft_size```. Defaults to 1024.\n\n        hop_length (int):\n            Number of audio samples between adjacent STFT columns. Defaults to 1024.\n\n        frame_shift_ms (int):\n            Set ```hop_length``` based on milliseconds and sampling rate.\n\n        frame_length_ms (int):\n            Set ```win_length``` based on milliseconds and sampling rate.\n\n        stft_pad_mode (str):\n            Padding method used in STFT. 'reflect' or 'center'. Defaults to 'reflect'.\n\n        sample_rate (int):\n            Audio sampling rate. Defaults to 22050.\n\n        resample (bool):\n            Enable / Disable resampling audio to ```sample_rate```. Defaults to ```False```.\n\n        preemphasis (float):\n            Preemphasis coefficient. Defaults to 0.0.\n\n        ref_level_db (int): 20\n            Reference Db level to rebase the audio signal and ignore the level below. 20Db is assumed the sound of air.\n            Defaults to 20.\n\n        do_sound_norm (bool):\n            Enable / Disable sound normalization to reconcile the volume differences among samples. Defaults to False.\n\n        log_func (str):\n            Numpy log function used for amplitude to DB conversion. 
Defaults to 'np.log10'.\n\n        do_trim_silence (bool):\n            Enable / Disable trimming silences at the beginning and the end of the audio clip. Defaults to ```True```.\n\n        do_amp_to_db_linear (bool, optional):\n            enable/disable amplitude to dB conversion of linear spectrograms. Defaults to True.\n\n        do_amp_to_db_mel (bool, optional):\n            enable/disable amplitude to dB conversion of mel spectrograms. Defaults to True.\n\n        pitch_fmax (float, optional):\n            Maximum frequency of the F0 frames. Defaults to ```640```.\n\n        pitch_fmin (float, optional):\n            Minimum frequency of the F0 frames. Defaults to ```1```.\n\n        trim_db (int):\n            Silence threshold used for silence trimming. Defaults to 45.\n\n        do_rms_norm (bool, optional):\n            enable/disable RMS volume normalization when loading an audio file. Defaults to False.\n\n        db_level (float, optional):\n            dB level used for RMS normalization. The range is -99 to 0. Defaults to None.\n\n        power (float):\n            Exponent used for expanding spectrogram levels before running Griffin-Lim. It helps to reduce the\n            artifacts in the synthesized voice. Defaults to 1.5.\n\n        griffin_lim_iters (int):\n            Number of Griffin-Lim iterations. Defaults to 60.\n\n        num_mels (int):\n            Number of mel-basis filters, i.e. the number of bins in each mel-spectrogram frame. Defaults to 80.\n\n        mel_fmin (float):\n            Min frequency level used for the mel-basis filters. ~50 for male and ~95 for female voices.\n            It needs to be adjusted for a dataset. Defaults to 0.\n\n        mel_fmax (float):\n            Max frequency level used for the mel-basis filters. It needs to be adjusted for a dataset.\n\n        spec_gain (int):\n            Gain applied when converting amplitude to dB. 
Defaults to 20.\n\n        signal_norm (bool):\n            enable/disable signal normalization. Defaults to True.\n\n        min_level_db (int):\n            minimum dB threshold for the computed melspectrograms. Defaults to -100.\n\n        symmetric_norm (bool):\n            enable/disable symmetric normalization. If set True, normalization is performed in the range [-k, k], else\n            [0, k]. Defaults to True.\n\n        max_norm (float):\n            ```k``` defining the normalization range. Defaults to 4.0.\n\n        clip_norm (bool):\n            enable/disable clipping the out-of-range values in the normalized audio signal. Defaults to True.\n\n        stats_path (str):\n            Path to the computed stats file. Defaults to None.\n    \"\"\"\n\n    # stft parameters\n    fft_size: int = 1024\n    win_length: int = 1024\n    hop_length: int = 256\n    frame_shift_ms: int = None\n    frame_length_ms: int = None\n    stft_pad_mode: str = \"reflect\"\n    # audio processing parameters\n    sample_rate: int = 22050\n    resample: bool = False\n    preemphasis: float = 0.0\n    ref_level_db: int = 20\n    do_sound_norm: bool = False\n    log_func: str = \"np.log10\"\n    # silence trimming\n    do_trim_silence: bool = True\n    trim_db: int = 45\n    # rms volume normalization\n    do_rms_norm: bool = False\n    db_level: float = None\n    # griffin-lim params\n    power: float = 1.5\n    griffin_lim_iters: int = 60\n    # mel-spec params\n    num_mels: int = 80\n    mel_fmin: float = 0.0\n    mel_fmax: float = None\n    spec_gain: int = 20\n    do_amp_to_db_linear: bool = True\n    do_amp_to_db_mel: bool = True\n    # f0 params\n    pitch_fmax: float = 640.0\n    pitch_fmin: float = 1.0\n    # normalization params\n    signal_norm: bool = True\n    min_level_db: int = -100\n    symmetric_norm: bool = True\n    max_norm: float = 4.0\n    clip_norm: bool = True\n    stats_path: str = None\n\n    def check_values(\n        self,\n    ):\n        
\"\"\"Check config fields\"\"\"\n        c = asdict(self)\n        check_argument(\"num_mels\", c, restricted=True, min_val=10, max_val=2056)\n        check_argument(\"fft_size\", c, restricted=True, min_val=128, max_val=4058)\n        check_argument(\"sample_rate\", c, restricted=True, min_val=512, max_val=100000)\n        check_argument(\n            \"frame_length_ms\",\n            c,\n            restricted=True,\n            min_val=10,\n            max_val=1000,\n            alternative=\"win_length\",\n        )\n        check_argument(\"frame_shift_ms\", c, restricted=True, min_val=1, max_val=1000, alternative=\"hop_length\")\n        check_argument(\"preemphasis\", c, restricted=True, min_val=0, max_val=1)\n        check_argument(\"min_level_db\", c, restricted=True, min_val=-1000, max_val=10)\n        check_argument(\"ref_level_db\", c, restricted=True, min_val=0, max_val=1000)\n        check_argument(\"power\", c, restricted=True, min_val=1, max_val=5)\n        check_argument(\"griffin_lim_iters\", c, restricted=True, min_val=10, max_val=1000)\n\n        # normalization parameters\n        check_argument(\"signal_norm\", c, restricted=True)\n        check_argument(\"symmetric_norm\", c, restricted=True)\n        check_argument(\"max_norm\", c, restricted=True, min_val=0.1, max_val=1000)\n        check_argument(\"clip_norm\", c, restricted=True)\n        check_argument(\"mel_fmin\", c, restricted=True, min_val=0.0, max_val=1000)\n        check_argument(\"mel_fmax\", c, restricted=True, min_val=500.0, allow_none=True)\n        check_argument(\"spec_gain\", c, restricted=True, min_val=1, max_val=100)\n        check_argument(\"do_trim_silence\", c, restricted=True)\n        check_argument(\"trim_db\", c, restricted=True)\n\n\n@dataclass\nclass BaseDatasetConfig(Coqpit):\n    \"\"\"Base config for TTS datasets.\n\n    Args:\n        formatter (str):\n            Formatter name that defines used formatter in ```TTS.tts.datasets.formatter```. 
Defaults to `\"\"`.\n\n        dataset_name (str):\n            Unique name for the dataset. Defaults to `\"\"`.\n\n        path (str):\n            Root path to the dataset files. Defaults to `\"\"`.\n\n        meta_file_train (str):\n            Name of the dataset meta file. Or a list of speakers to be ignored at training for multi-speaker datasets.\n            Defaults to `\"\"`.\n\n        ignored_speakers (List):\n            List of speakers IDs that are not used at the training. Default None.\n\n        language (str):\n            Language code of the dataset. If defined, it overrides `phoneme_language`. Defaults to `\"\"`.\n\n        phonemizer (str):\n            Phonemizer used for that dataset's language. By default it uses `DEF_LANG_TO_PHONEMIZER`. Defaults to `\"\"`.\n\n        meta_file_val (str):\n            Name of the dataset meta file that defines the instances used at validation.\n\n        meta_file_attn_mask (str):\n            Path to the file that lists the attention mask files used with models that require attention masks to\n            train the duration predictor.\n    \"\"\"\n\n    formatter: str = \"\"\n    dataset_name: str = \"\"\n    path: str = \"\"\n    meta_file_train: str = \"\"\n    ignored_speakers: List[str] = None\n    language: str = \"\"\n    phonemizer: str = \"\"\n    meta_file_val: str = \"\"\n    meta_file_attn_mask: str = \"\"\n\n    def check_values(\n        self,\n    ):\n        \"\"\"Check config fields\"\"\"\n        c = asdict(self)\n        check_argument(\"formatter\", c, restricted=True)\n        check_argument(\"path\", c, restricted=True)\n        check_argument(\"meta_file_train\", c, restricted=True)\n        check_argument(\"meta_file_val\", c, restricted=False)\n        check_argument(\"meta_file_attn_mask\", c, restricted=False)\n\n\n@dataclass\nclass BaseTrainingConfig(TrainerConfig):\n    \"\"\"Base config to define the basic 🐸TTS training parameters that are shared\n    among all the models. 
It is based on ```Trainer.TrainingConfig```.\n\n    Args:\n        model (str):\n            Name of the model that is used in the training.\n\n        num_loader_workers (int):\n            Number of workers for training time dataloader.\n\n        num_eval_loader_workers (int):\n            Number of workers for evaluation time dataloader.\n    \"\"\"\n\n    model: str = None\n    # dataloading\n    num_loader_workers: int = 0\n    num_eval_loader_workers: int = 0\n    use_noise_augment: bool = False\n"
  },
  {
    "path": "TTS/encoder/README.md",
    "content": "### Speaker Encoder\n\nThis is an implementation of https://arxiv.org/abs/1710.10467. This model can be used for voice and speaker embedding.\n\nWith the code here you can generate d-vectors for both multi-speaker and single-speaker TTS datasets, then visualise and explore them along with the associated audio files in an interactive chart.\n\nBelow is an example showing embedding results of various speakers. You can generate the same plot with the provided notebook as demonstrated in [this video](https://youtu.be/KW3oO7JVa7Q).\n\n![](umap.png)\n\nDownload a pretrained model from [Released Models](https://github.com/mozilla/TTS/wiki/Released-Models) page.\n\nTo run the code, you need to follow the same flow as in TTS.\n\n- Define 'config.json' for your needs. Note that, audio parameters should match your TTS model.\n- Example training call ```python speaker_encoder/train.py --config_path speaker_encoder/config.json --data_path ~/Data/Libri-TTS/train-clean-360```\n- Generate embedding vectors ```python speaker_encoder/compute_embeddings.py --use_cuda true /model/path/best_model.pth model/config/path/config.json dataset/path/ output_path``` . This code parses all .wav files at the given dataset path and generates the same folder structure under the output path with the generated embedding files.\n- Watch training on Tensorboard as in TTS\n"
  },
  {
    "path": "TTS/encoder/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/encoder/configs/base_encoder_config.py",
    "content": "from dataclasses import asdict, dataclass, field\nfrom typing import Dict, List\n\nfrom coqpit import MISSING\n\nfrom TTS.config.shared_configs import BaseAudioConfig, BaseDatasetConfig, BaseTrainingConfig\n\n\n@dataclass\nclass BaseEncoderConfig(BaseTrainingConfig):\n    \"\"\"Defines parameters for a Generic Encoder model.\"\"\"\n\n    model: str = None\n    audio: BaseAudioConfig = field(default_factory=BaseAudioConfig)\n    datasets: List[BaseDatasetConfig] = field(default_factory=lambda: [BaseDatasetConfig()])\n    # model params\n    model_params: Dict = field(\n        default_factory=lambda: {\n            \"model_name\": \"lstm\",\n            \"input_dim\": 80,\n            \"proj_dim\": 256,\n            \"lstm_dim\": 768,\n            \"num_lstm_layers\": 3,\n            \"use_lstm_with_projection\": True,\n        }\n    )\n\n    audio_augmentation: Dict = field(default_factory=lambda: {})\n\n    # training params\n    epochs: int = 10000\n    loss: str = \"angleproto\"\n    grad_clip: float = 3.0\n    lr: float = 0.0001\n    optimizer: str = \"radam\"\n    optimizer_params: Dict = field(default_factory=lambda: {\"betas\": [0.9, 0.999], \"weight_decay\": 0})\n    lr_decay: bool = False\n    warmup_steps: int = 4000\n\n    # logging params\n    tb_model_param_stats: bool = False\n    steps_plot_stats: int = 10\n    save_step: int = 1000\n    print_step: int = 20\n    run_eval: bool = False\n\n    # data loader\n    num_classes_in_batch: int = MISSING\n    num_utter_per_class: int = MISSING\n    eval_num_classes_in_batch: int = None\n    eval_num_utter_per_class: int = None\n\n    num_loader_workers: int = MISSING\n    voice_len: float = 1.6\n\n    def check_values(self):\n        super().check_values()\n        c = asdict(self)\n        assert (\n            c[\"model_params\"][\"input_dim\"] == self.audio.num_mels\n        ), \" [!] model input dimendion must be equal to melspectrogram dimension.\"\n"
  },
  {
    "path": "TTS/encoder/configs/emotion_encoder_config.py",
    "content": "from dataclasses import asdict, dataclass\n\nfrom TTS.encoder.configs.base_encoder_config import BaseEncoderConfig\n\n\n@dataclass\nclass EmotionEncoderConfig(BaseEncoderConfig):\n    \"\"\"Defines parameters for Emotion Encoder model.\"\"\"\n\n    model: str = \"emotion_encoder\"\n    map_classid_to_classname: dict = None\n    class_name_key: str = \"emotion_name\"\n"
  },
  {
    "path": "TTS/encoder/configs/speaker_encoder_config.py",
    "content": "from dataclasses import asdict, dataclass\n\nfrom TTS.encoder.configs.base_encoder_config import BaseEncoderConfig\n\n\n@dataclass\nclass SpeakerEncoderConfig(BaseEncoderConfig):\n    \"\"\"Defines parameters for Speaker Encoder model.\"\"\"\n\n    model: str = \"speaker_encoder\"\n    class_name_key: str = \"speaker_name\"\n"
  },
  {
    "path": "TTS/encoder/dataset.py",
    "content": "import random\n\nimport torch\nfrom torch.utils.data import Dataset\n\nfrom TTS.encoder.utils.generic_utils import AugmentWAV\n\n\nclass EncoderDataset(Dataset):\n    def __init__(\n        self,\n        config,\n        ap,\n        meta_data,\n        voice_len=1.6,\n        num_classes_in_batch=64,\n        num_utter_per_class=10,\n        verbose=False,\n        augmentation_config=None,\n        use_torch_spec=None,\n    ):\n        \"\"\"\n        Args:\n            ap (TTS.tts.utils.AudioProcessor): audio processor object.\n            meta_data (list): list of dataset instances.\n            seq_len (int): voice segment length in seconds.\n            verbose (bool): print diagnostic information.\n        \"\"\"\n        super().__init__()\n        self.config = config\n        self.items = meta_data\n        self.sample_rate = ap.sample_rate\n        self.seq_len = int(voice_len * self.sample_rate)\n        self.num_utter_per_class = num_utter_per_class\n        self.ap = ap\n        self.verbose = verbose\n        self.use_torch_spec = use_torch_spec\n        self.classes, self.items = self.__parse_items()\n\n        self.classname_to_classid = {key: i for i, key in enumerate(self.classes)}\n\n        # Data Augmentation\n        self.augmentator = None\n        self.gaussian_augmentation_config = None\n        if augmentation_config:\n            self.data_augmentation_p = augmentation_config[\"p\"]\n            if self.data_augmentation_p and (\"additive\" in augmentation_config or \"rir\" in augmentation_config):\n                self.augmentator = AugmentWAV(ap, augmentation_config)\n\n            if \"gaussian\" in augmentation_config.keys():\n                self.gaussian_augmentation_config = augmentation_config[\"gaussian\"]\n\n        if self.verbose:\n            print(\"\\n > DataLoader initialization\")\n            print(f\" | > Classes per Batch: {num_classes_in_batch}\")\n            print(f\" | > Number of instances : 
{len(self.items)}\")\n            print(f\" | > Sequence length: {self.seq_len}\")\n            print(f\" | > Num Classes: {len(self.classes)}\")\n            print(f\" | > Classes: {self.classes}\")\n\n    def load_wav(self, filename):\n        audio = self.ap.load_wav(filename, sr=self.ap.sample_rate)\n        return audio\n\n    def __parse_items(self):\n        class_to_utters = {}\n        for item in self.items:\n            path_ = item[\"audio_file\"]\n            class_name = item[self.config.class_name_key]\n            if class_name in class_to_utters.keys():\n                class_to_utters[class_name].append(path_)\n            else:\n                class_to_utters[class_name] = [\n                    path_,\n                ]\n\n        # keep only classes with at least self.num_utter_per_class samples\n        class_to_utters = {k: v for (k, v) in class_to_utters.items() if len(v) >= self.num_utter_per_class}\n\n        classes = list(class_to_utters.keys())\n        classes.sort()\n\n        new_items = []\n        for item in self.items:\n            path_ = item[\"audio_file\"]\n            class_name = item[\"emotion_name\"] if self.config.model == \"emotion_encoder\" else item[\"speaker_name\"]\n            # ignore filtered classes\n            if class_name not in classes:\n                continue\n            # ignore audio clips that are not longer than the segment length\n            if self.load_wav(path_).shape[0] - self.seq_len <= 0:\n                continue\n\n            new_items.append({\"wav_file_path\": path_, \"class_name\": class_name})\n\n        return classes, new_items\n\n    def __len__(self):\n        return len(self.items)\n\n    def get_num_classes(self):\n        return len(self.classes)\n\n    def get_class_list(self):\n        return self.classes\n\n    def set_classes(self, classes):\n        self.classes = classes\n        self.classname_to_classid = {key: i for i, key in enumerate(self.classes)}\n\n    def get_map_classid_to_classname(self):\n        
return dict((c_id, c_n) for c_n, c_id in self.classname_to_classid.items())\n\n    def __getitem__(self, idx):\n        return self.items[idx]\n\n    def collate_fn(self, batch):\n        # get the batch class_ids\n        labels = []\n        feats = []\n        for item in batch:\n            utter_path = item[\"wav_file_path\"]\n            class_name = item[\"class_name\"]\n\n            # get classid\n            class_id = self.classname_to_classid[class_name]\n            # load wav file\n            wav = self.load_wav(utter_path)\n            offset = random.randint(0, wav.shape[0] - self.seq_len)\n            wav = wav[offset : offset + self.seq_len]\n\n            if self.augmentator is not None and self.data_augmentation_p:\n                if random.random() < self.data_augmentation_p:\n                    wav = self.augmentator.apply_one(wav)\n\n            if not self.use_torch_spec:\n                mel = self.ap.melspectrogram(wav)\n                feats.append(torch.FloatTensor(mel))\n            else:\n                feats.append(torch.FloatTensor(wav))\n\n            labels.append(class_id)\n\n        feats = torch.stack(feats)\n        labels = torch.LongTensor(labels)\n\n        return feats, labels\n"
  },
  {
    "path": "TTS/encoder/losses.py",
    "content": "import torch\nimport torch.nn.functional as F\nfrom torch import nn\n\n\n# adapted from https://github.com/cvqluu/GE2E-Loss\nclass GE2ELoss(nn.Module):\n    def __init__(self, init_w=10.0, init_b=-5.0, loss_method=\"softmax\"):\n        \"\"\"\n        Implementation of the Generalized End-to-End loss defined in https://arxiv.org/abs/1710.10467 [1]\n        Accepts an input of size (N, M, D)\n            where N is the number of speakers in the batch,\n            M is the number of utterances per speaker,\n            and D is the dimensionality of the embedding vector (e.g. d-vector)\n        Args:\n            - init_w (float): defines the initial value of w in Equation (5) of [1]\n            - init_b (float): definies the initial value of b in Equation (5) of [1]\n        \"\"\"\n        super().__init__()\n        # pylint: disable=E1102\n        self.w = nn.Parameter(torch.tensor(init_w))\n        # pylint: disable=E1102\n        self.b = nn.Parameter(torch.tensor(init_b))\n        self.loss_method = loss_method\n\n        print(\" > Initialized Generalized End-to-End loss\")\n\n        assert self.loss_method in [\"softmax\", \"contrast\"]\n\n        if self.loss_method == \"softmax\":\n            self.embed_loss = self.embed_loss_softmax\n        if self.loss_method == \"contrast\":\n            self.embed_loss = self.embed_loss_contrast\n\n    # pylint: disable=R0201\n    def calc_new_centroids(self, dvecs, centroids, spkr, utt):\n        \"\"\"\n        Calculates the new centroids excluding the reference utterance\n        \"\"\"\n        excl = torch.cat((dvecs[spkr, :utt], dvecs[spkr, utt + 1 :]))\n        excl = torch.mean(excl, 0)\n        new_centroids = []\n        for i, centroid in enumerate(centroids):\n            if i == spkr:\n                new_centroids.append(excl)\n            else:\n                new_centroids.append(centroid)\n        return torch.stack(new_centroids)\n\n    def calc_cosine_sim(self, dvecs, 
centroids):\n        \"\"\"\n        Make the cosine similarity matrix with dims (N,M,N)\n        \"\"\"\n        cos_sim_matrix = []\n        for spkr_idx, speaker in enumerate(dvecs):\n            cs_row = []\n            for utt_idx, utterance in enumerate(speaker):\n                new_centroids = self.calc_new_centroids(dvecs, centroids, spkr_idx, utt_idx)\n                # vector based cosine similarity for speed\n                cs_row.append(\n                    torch.clamp(\n                        torch.mm(\n                            utterance.unsqueeze(1).transpose(0, 1),\n                            new_centroids.transpose(0, 1),\n                        )\n                        / (torch.norm(utterance) * torch.norm(new_centroids, dim=1)),\n                        1e-6,\n                    )\n                )\n            cs_row = torch.cat(cs_row, dim=0)\n            cos_sim_matrix.append(cs_row)\n        return torch.stack(cos_sim_matrix)\n\n    # pylint: disable=R0201\n    def embed_loss_softmax(self, dvecs, cos_sim_matrix):\n        \"\"\"\n        Calculates the loss on each embedding $L(e_{ji})$ by taking softmax\n        \"\"\"\n        N, M, _ = dvecs.shape\n        L = []\n        for j in range(N):\n            L_row = []\n            for i in range(M):\n                L_row.append(-F.log_softmax(cos_sim_matrix[j, i], 0)[j])\n            L_row = torch.stack(L_row)\n            L.append(L_row)\n        return torch.stack(L)\n\n    # pylint: disable=R0201\n    def embed_loss_contrast(self, dvecs, cos_sim_matrix):\n        \"\"\"\n        Calculates the loss on each embedding $L(e_{ji})$ by contrast loss with closest centroid\n        \"\"\"\n        N, M, _ = dvecs.shape\n        L = []\n        for j in range(N):\n            L_row = []\n            for i in range(M):\n                centroids_sigmoids = torch.sigmoid(cos_sim_matrix[j, i])\n                excl_centroids_sigmoids = torch.cat((centroids_sigmoids[:j], 
centroids_sigmoids[j + 1 :]))\n                L_row.append(1.0 - torch.sigmoid(cos_sim_matrix[j, i, j]) + torch.max(excl_centroids_sigmoids))\n            L_row = torch.stack(L_row)\n            L.append(L_row)\n        return torch.stack(L)\n\n    def forward(self, x, _label=None):\n        \"\"\"\n        Calculates the GE2E loss for an input of dimensions (num_speakers, num_utts_per_speaker, dvec_feats)\n        \"\"\"\n\n        assert x.size()[1] >= 2\n\n        centroids = torch.mean(x, 1)\n        cos_sim_matrix = self.calc_cosine_sim(x, centroids)\n        # torch.clamp is not in-place: keep the clamped value so that w stays positive\n        w = torch.clamp(self.w, 1e-6)\n        cos_sim_matrix = w * cos_sim_matrix + self.b\n        L = self.embed_loss(x, cos_sim_matrix)\n        return L.mean()\n\n\n# adapted from https://github.com/clovaai/voxceleb_trainer/blob/master/loss/angleproto.py\nclass AngleProtoLoss(nn.Module):\n    \"\"\"\n    Implementation of the Angular Prototypical loss defined in https://arxiv.org/abs/2003.11982\n        Accepts an input of size (N, M, D)\n            where N is the number of speakers in the batch,\n            M is the number of utterances per speaker,\n            and D is the dimensionality of the embedding vector\n        Args:\n            - init_w (float): defines the initial value of w\n            - init_b (float): defines the initial value of b\n    \"\"\"\n\n    def __init__(self, init_w=10.0, init_b=-5.0):\n        super().__init__()\n        # pylint: disable=E1102\n        self.w = nn.Parameter(torch.tensor(init_w))\n        # pylint: disable=E1102\n        self.b = nn.Parameter(torch.tensor(init_b))\n        self.criterion = torch.nn.CrossEntropyLoss()\n\n        print(\" > Initialized Angular Prototypical loss\")\n\n    def forward(self, x, _label=None):\n        \"\"\"\n        Calculates the AngleProto loss for an input of dimensions (num_speakers, num_utts_per_speaker, dvec_feats)\n        \"\"\"\n\n        assert x.size()[1] >= 2\n\n        out_anchor = torch.mean(x[:, 1:, :], 1)\n       
 out_positive = x[:, 0, :]\n        num_speakers = out_anchor.size()[0]\n\n        cos_sim_matrix = F.cosine_similarity(\n            out_positive.unsqueeze(-1).expand(-1, -1, num_speakers),\n            out_anchor.unsqueeze(-1).expand(-1, -1, num_speakers).transpose(0, 2),\n        )\n        # torch.clamp is not in-place: keep the clamped value so that w stays positive\n        w = torch.clamp(self.w, 1e-6)\n        cos_sim_matrix = cos_sim_matrix * w + self.b\n        label = torch.arange(num_speakers).to(cos_sim_matrix.device)\n        L = self.criterion(cos_sim_matrix, label)\n        return L\n\n\nclass SoftmaxLoss(nn.Module):\n    \"\"\"\n    Implementation of the Softmax loss as defined in https://arxiv.org/abs/2003.11982\n        Args:\n            - embedding_dim (int): speaker embedding dim\n            - n_speakers (int): number of speakers\n    \"\"\"\n\n    def __init__(self, embedding_dim, n_speakers):\n        super().__init__()\n\n        self.criterion = torch.nn.CrossEntropyLoss()\n        self.fc = nn.Linear(embedding_dim, n_speakers)\n\n        print(\"Initialised Softmax Loss\")\n\n    def forward(self, x, label=None):\n        # reshape for compatibility\n        x = x.reshape(-1, x.size()[-1])\n        label = label.reshape(-1)\n\n        x = self.fc(x)\n        L = self.criterion(x, label)\n\n        return L\n\n    def inference(self, embedding):\n        x = self.fc(embedding)\n        activations = torch.nn.functional.softmax(x, dim=1).squeeze(0)\n        class_id = torch.argmax(activations)\n        return class_id\n\n\nclass SoftmaxAngleProtoLoss(nn.Module):\n    \"\"\"\n    Implementation of the Softmax AnglePrototypical loss as defined in https://arxiv.org/abs/2009.14153\n        Args:\n            - embedding_dim (int): speaker embedding dim\n            - n_speakers (int): number of speakers\n            - init_w (float): defines the initial value of w\n            - init_b (float): defines the initial value of b\n    \"\"\"\n\n    def __init__(self, embedding_dim, n_speakers, init_w=10.0, init_b=-5.0):\n    
    super().__init__()\n\n        self.softmax = SoftmaxLoss(embedding_dim, n_speakers)\n        self.angleproto = AngleProtoLoss(init_w, init_b)\n\n        print(\"Initialised SoftmaxAnglePrototypical Loss\")\n\n    def forward(self, x, label=None):\n        \"\"\"\n        Calculates the SoftmaxAnglePrototypical loss for an input of dimensions (num_speakers, num_utts_per_speaker, dvec_feats)\n        \"\"\"\n\n        Lp = self.angleproto(x)\n\n        Ls = self.softmax(x, label)\n\n        return Ls + Lp\n"
  },
  {
    "path": "TTS/encoder/models/base_encoder.py",
    "content": "import numpy as np\nimport torch\nimport torchaudio\nfrom coqpit import Coqpit\nfrom torch import nn\n\nfrom TTS.encoder.losses import AngleProtoLoss, GE2ELoss, SoftmaxAngleProtoLoss\nfrom TTS.utils.generic_utils import set_init_dict\nfrom TTS.utils.io import load_fsspec\n\n\nclass PreEmphasis(nn.Module):\n    def __init__(self, coefficient=0.97):\n        super().__init__()\n        self.coefficient = coefficient\n        self.register_buffer(\"filter\", torch.FloatTensor([-self.coefficient, 1.0]).unsqueeze(0).unsqueeze(0))\n\n    def forward(self, x):\n        assert len(x.size()) == 2\n\n        x = torch.nn.functional.pad(x.unsqueeze(1), (1, 0), \"reflect\")\n        return torch.nn.functional.conv1d(x, self.filter).squeeze(1)\n\n\nclass BaseEncoder(nn.Module):\n    \"\"\"Base `encoder` class. Every new `encoder` model must inherit this.\n\n    It defines common `encoder` specific functions.\n    \"\"\"\n\n    # pylint: disable=W0102\n    def __init__(self):\n        super(BaseEncoder, self).__init__()\n\n    def get_torch_mel_spectrogram_class(self, audio_config):\n        return torch.nn.Sequential(\n            PreEmphasis(audio_config[\"preemphasis\"]),\n            # TorchSTFT(\n            #     n_fft=audio_config[\"fft_size\"],\n            #     hop_length=audio_config[\"hop_length\"],\n            #     win_length=audio_config[\"win_length\"],\n            #     sample_rate=audio_config[\"sample_rate\"],\n            #     window=\"hamming_window\",\n            #     mel_fmin=0.0,\n            #     mel_fmax=None,\n            #     use_htk=True,\n            #     do_amp_to_db=False,\n            #     n_mels=audio_config[\"num_mels\"],\n            #     power=2.0,\n            #     use_mel=True,\n            #     mel_norm=None,\n            # )\n            torchaudio.transforms.MelSpectrogram(\n                sample_rate=audio_config[\"sample_rate\"],\n                n_fft=audio_config[\"fft_size\"],\n                
win_length=audio_config[\"win_length\"],\n                hop_length=audio_config[\"hop_length\"],\n                window_fn=torch.hamming_window,\n                n_mels=audio_config[\"num_mels\"],\n            ),\n        )\n\n    @torch.no_grad()\n    def inference(self, x, l2_norm=True):\n        return self.forward(x, l2_norm)\n\n    @torch.no_grad()\n    def compute_embedding(self, x, num_frames=250, num_eval=10, return_mean=True, l2_norm=True):\n        \"\"\"\n        Generate embeddings for a batch of utterances\n        x: 1xTxD\n        \"\"\"\n        # map to the waveform size\n        if self.use_torch_spec:\n            num_frames = num_frames * self.audio_config[\"hop_length\"]\n\n        max_len = x.shape[1]\n\n        if max_len < num_frames:\n            num_frames = max_len\n\n        offsets = np.linspace(0, max_len - num_frames, num=num_eval)\n\n        frames_batch = []\n        for offset in offsets:\n            offset = int(offset)\n            end_offset = int(offset + num_frames)\n            frames = x[:, offset:end_offset]\n            frames_batch.append(frames)\n\n        frames_batch = torch.cat(frames_batch, dim=0)\n        embeddings = self.inference(frames_batch, l2_norm=l2_norm)\n\n        if return_mean:\n            embeddings = torch.mean(embeddings, dim=0, keepdim=True)\n        return embeddings\n\n    def get_criterion(self, c: Coqpit, num_classes=None):\n        if c.loss == \"ge2e\":\n            criterion = GE2ELoss(loss_method=\"softmax\")\n        elif c.loss == \"angleproto\":\n            criterion = AngleProtoLoss()\n        elif c.loss == \"softmaxproto\":\n            criterion = SoftmaxAngleProtoLoss(c.model_params[\"proj_dim\"], num_classes)\n        else:\n            raise Exception(\"The %s loss is not supported\" % c.loss)\n        return criterion\n\n    def load_checkpoint(\n        self,\n        config: Coqpit,\n        checkpoint_path: str,\n        eval: bool = False,\n        use_cuda: bool = 
False,\n        criterion=None,\n        cache=False,\n    ):\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        try:\n            self.load_state_dict(state[\"model\"])\n            print(\" > Model fully restored. \")\n        except (KeyError, RuntimeError) as error:\n            # in eval mode, re-raise the error\n            if eval:\n                raise error\n\n            print(\" > Partial model initialization.\")\n            model_dict = self.state_dict()\n            model_dict = set_init_dict(model_dict, state[\"model\"], config)\n            self.load_state_dict(model_dict)\n            del model_dict\n\n        # load the criterion for restore_path\n        if criterion is not None and \"criterion\" in state:\n            try:\n                criterion.load_state_dict(state[\"criterion\"])\n            except (KeyError, RuntimeError) as error:\n                print(\" > Criterion load ignored because of:\", error)\n\n        # instantiate and load the criterion for the encoder classifier at inference time\n        if (\n            eval\n            and criterion is None\n            and \"criterion\" in state\n            and getattr(config, \"map_classid_to_classname\", None) is not None\n        ):\n            criterion = self.get_criterion(config, len(config.map_classid_to_classname))\n            criterion.load_state_dict(state[\"criterion\"])\n\n        if use_cuda:\n            self.cuda()\n            if criterion is not None:\n                criterion = criterion.cuda()\n\n        if eval:\n            self.eval()\n            assert not self.training\n\n        if not eval:\n            return criterion, state[\"step\"]\n        return criterion\n"
  },
  {
    "path": "TTS/encoder/models/lstm.py",
    "content": "import torch\nfrom torch import nn\n\nfrom TTS.encoder.models.base_encoder import BaseEncoder\n\n\nclass LSTMWithProjection(nn.Module):\n    def __init__(self, input_size, hidden_size, proj_size):\n        super().__init__()\n        self.input_size = input_size\n        self.hidden_size = hidden_size\n        self.proj_size = proj_size\n        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)\n        self.linear = nn.Linear(hidden_size, proj_size, bias=False)\n\n    def forward(self, x):\n        self.lstm.flatten_parameters()\n        o, (_, _) = self.lstm(x)\n        return self.linear(o)\n\n\nclass LSTMWithoutProjection(nn.Module):\n    def __init__(self, input_dim, lstm_dim, proj_dim, num_lstm_layers):\n        super().__init__()\n        self.lstm = nn.LSTM(input_size=input_dim, hidden_size=lstm_dim, num_layers=num_lstm_layers, batch_first=True)\n        self.linear = nn.Linear(lstm_dim, proj_dim, bias=True)\n        self.relu = nn.ReLU()\n\n    def forward(self, x):\n        _, (hidden, _) = self.lstm(x)\n        return self.relu(self.linear(hidden[-1]))\n\n\nclass LSTMSpeakerEncoder(BaseEncoder):\n    def __init__(\n        self,\n        input_dim,\n        proj_dim=256,\n        lstm_dim=768,\n        num_lstm_layers=3,\n        use_lstm_with_projection=True,\n        use_torch_spec=False,\n        audio_config=None,\n    ):\n        super().__init__()\n        self.use_lstm_with_projection = use_lstm_with_projection\n        self.use_torch_spec = use_torch_spec\n        self.audio_config = audio_config\n        self.proj_dim = proj_dim\n\n        layers = []\n        # choise LSTM layer\n        if use_lstm_with_projection:\n            layers.append(LSTMWithProjection(input_dim, lstm_dim, proj_dim))\n            for _ in range(num_lstm_layers - 1):\n                layers.append(LSTMWithProjection(proj_dim, lstm_dim, proj_dim))\n            self.layers = nn.Sequential(*layers)\n        else:\n            self.layers = 
LSTMWithoutProjection(input_dim, lstm_dim, proj_dim, num_lstm_layers)\n\n        self.instancenorm = nn.InstanceNorm1d(input_dim)\n\n        if self.use_torch_spec:\n            self.torch_spec = self.get_torch_mel_spectrogram_class(audio_config)\n        else:\n            self.torch_spec = None\n\n        self._init_layers()\n\n    def _init_layers(self):\n        for name, param in self.layers.named_parameters():\n            if \"bias\" in name:\n                nn.init.constant_(param, 0.0)\n            elif \"weight\" in name:\n                nn.init.xavier_normal_(param)\n\n    def forward(self, x, l2_norm=True):\n        \"\"\"Forward pass of the model.\n\n        Args:\n            x (Tensor): Raw waveform signal or spectrogram frames. If input is a waveform, `torch_spec` must be `True`\n                to compute the spectrogram on-the-fly.\n            l2_norm (bool): Whether to L2-normalize the outputs.\n\n        Shapes:\n            - x: :math:`(N, 1, T_{in})` or :math:`(N, D_{spec}, T_{in})`\n        \"\"\"\n        with torch.no_grad():\n            with torch.cuda.amp.autocast(enabled=False):\n                if self.use_torch_spec:\n                    x.squeeze_(1)\n                    x = self.torch_spec(x)\n                x = self.instancenorm(x).transpose(1, 2)\n        d = self.layers(x)\n        if self.use_lstm_with_projection:\n            d = d[:, -1]\n        if l2_norm:\n            d = torch.nn.functional.normalize(d, p=2, dim=1)\n        return d\n"
  },
  {
    "path": "TTS/encoder/models/resnet.py",
    "content": "import torch\nfrom torch import nn\n\n# from TTS.utils.audio.torch_transforms import TorchSTFT\nfrom TTS.encoder.models.base_encoder import BaseEncoder\n\n\nclass SELayer(nn.Module):\n    def __init__(self, channel, reduction=8):\n        super(SELayer, self).__init__()\n        self.avg_pool = nn.AdaptiveAvgPool2d(1)\n        self.fc = nn.Sequential(\n            nn.Linear(channel, channel // reduction),\n            nn.ReLU(inplace=True),\n            nn.Linear(channel // reduction, channel),\n            nn.Sigmoid(),\n        )\n\n    def forward(self, x):\n        b, c, _, _ = x.size()\n        y = self.avg_pool(x).view(b, c)\n        y = self.fc(y).view(b, c, 1, 1)\n        return x * y\n\n\nclass SEBasicBlock(nn.Module):\n    expansion = 1\n\n    def __init__(self, inplanes, planes, stride=1, downsample=None, reduction=8):\n        super(SEBasicBlock, self).__init__()\n        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=3, stride=stride, padding=1, bias=False)\n        self.bn1 = nn.BatchNorm2d(planes)\n        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, padding=1, bias=False)\n        self.bn2 = nn.BatchNorm2d(planes)\n        self.relu = nn.ReLU(inplace=True)\n        self.se = SELayer(planes, reduction)\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        residual = x\n\n        out = self.conv1(x)\n        out = self.relu(out)\n        out = self.bn1(out)\n\n        out = self.conv2(out)\n        out = self.bn2(out)\n        out = self.se(out)\n\n        if self.downsample is not None:\n            residual = self.downsample(x)\n\n        out += residual\n        out = self.relu(out)\n        return out\n\n\nclass ResNetSpeakerEncoder(BaseEncoder):\n    \"\"\"Implementation of the model H/ASP without batch normalization in speaker embedding. 
This model was proposed in: https://arxiv.org/abs/2009.14153\n    Adapted from: https://github.com/clovaai/voxceleb_trainer\n    \"\"\"\n\n    # pylint: disable=W0102\n    def __init__(\n        self,\n        input_dim=64,\n        proj_dim=512,\n        layers=[3, 4, 6, 3],\n        num_filters=[32, 64, 128, 256],\n        encoder_type=\"ASP\",\n        log_input=False,\n        use_torch_spec=False,\n        audio_config=None,\n    ):\n        super(ResNetSpeakerEncoder, self).__init__()\n\n        self.encoder_type = encoder_type\n        self.input_dim = input_dim\n        self.log_input = log_input\n        self.use_torch_spec = use_torch_spec\n        self.audio_config = audio_config\n        self.proj_dim = proj_dim\n\n        self.conv1 = nn.Conv2d(1, num_filters[0], kernel_size=3, stride=1, padding=1)\n        self.relu = nn.ReLU(inplace=True)\n        self.bn1 = nn.BatchNorm2d(num_filters[0])\n\n        self.inplanes = num_filters[0]\n        self.layer1 = self.create_layer(SEBasicBlock, num_filters[0], layers[0])\n        self.layer2 = self.create_layer(SEBasicBlock, num_filters[1], layers[1], stride=(2, 2))\n        self.layer3 = self.create_layer(SEBasicBlock, num_filters[2], layers[2], stride=(2, 2))\n        self.layer4 = self.create_layer(SEBasicBlock, num_filters[3], layers[3], stride=(2, 2))\n\n        self.instancenorm = nn.InstanceNorm1d(input_dim)\n\n        if self.use_torch_spec:\n            self.torch_spec = self.get_torch_mel_spectrogram_class(audio_config)\n        else:\n            self.torch_spec = None\n\n        outmap_size = int(self.input_dim / 8)\n\n        self.attention = nn.Sequential(\n            nn.Conv1d(num_filters[3] * outmap_size, 128, kernel_size=1),\n            nn.ReLU(),\n            nn.BatchNorm1d(128),\n            nn.Conv1d(128, num_filters[3] * outmap_size, kernel_size=1),\n            nn.Softmax(dim=2),\n        )\n\n        if self.encoder_type == \"SAP\":\n            out_dim = num_filters[3] * outmap_size\n  
      elif self.encoder_type == \"ASP\":\n            out_dim = num_filters[3] * outmap_size * 2\n        else:\n            raise ValueError(\"Undefined encoder\")\n\n        self.fc = nn.Linear(out_dim, proj_dim)\n\n        self._init_layers()\n\n    def _init_layers(self):\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d):\n                nn.init.kaiming_normal_(m.weight, mode=\"fan_out\", nonlinearity=\"relu\")\n            elif isinstance(m, nn.BatchNorm2d):\n                nn.init.constant_(m.weight, 1)\n                nn.init.constant_(m.bias, 0)\n\n    def create_layer(self, block, planes, blocks, stride=1):\n        downsample = None\n        if stride != 1 or self.inplanes != planes * block.expansion:\n            downsample = nn.Sequential(\n                nn.Conv2d(self.inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False),\n                nn.BatchNorm2d(planes * block.expansion),\n            )\n\n        layers = []\n        layers.append(block(self.inplanes, planes, stride, downsample))\n        self.inplanes = planes * block.expansion\n        for _ in range(1, blocks):\n            layers.append(block(self.inplanes, planes))\n\n        return nn.Sequential(*layers)\n\n    # pylint: disable=R0201\n    def new_parameter(self, *size):\n        out = nn.Parameter(torch.FloatTensor(*size))\n        nn.init.xavier_normal_(out)\n        return out\n\n    def forward(self, x, l2_norm=False):\n        \"\"\"Forward pass of the model.\n\n        Args:\n            x (Tensor): Raw waveform signal or spectrogram frames. 
If input is a waveform, `torch_spec` must be `True`\n                to compute the spectrogram on-the-fly.\n            l2_norm (bool): Whether to L2-normalize the outputs.\n\n        Shapes:\n            - x: :math:`(N, 1, T_{in})` or :math:`(N, D_{spec}, T_{in})`\n        \"\"\"\n        x.squeeze_(1)\n        # if you torch spec compute it otherwise use the mel spec computed by the AP\n        if self.use_torch_spec:\n            x = self.torch_spec(x)\n\n        if self.log_input:\n            x = (x + 1e-6).log()\n        x = self.instancenorm(x).unsqueeze(1)\n\n        x = self.conv1(x)\n        x = self.relu(x)\n        x = self.bn1(x)\n\n        x = self.layer1(x)\n        x = self.layer2(x)\n        x = self.layer3(x)\n        x = self.layer4(x)\n\n        x = x.reshape(x.size()[0], -1, x.size()[-1])\n\n        w = self.attention(x)\n\n        if self.encoder_type == \"SAP\":\n            x = torch.sum(x * w, dim=2)\n        elif self.encoder_type == \"ASP\":\n            mu = torch.sum(x * w, dim=2)\n            sg = torch.sqrt((torch.sum((x**2) * w, dim=2) - mu**2).clamp(min=1e-5))\n            x = torch.cat((mu, sg), 1)\n\n        x = x.view(x.size()[0], -1)\n        x = self.fc(x)\n\n        if l2_norm:\n            x = torch.nn.functional.normalize(x, p=2, dim=1)\n        return x\n"
  },
  {
    "path": "TTS/encoder/requirements.txt",
    "content": "umap-learn\nnumpy>=1.17.0\n"
  },
  {
    "path": "TTS/encoder/utils/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/encoder/utils/generic_utils.py",
    "content": "import datetime\nimport glob\nimport os\nimport random\nimport re\n\nimport numpy as np\nfrom scipy import signal\n\nfrom TTS.encoder.models.lstm import LSTMSpeakerEncoder\nfrom TTS.encoder.models.resnet import ResNetSpeakerEncoder\nfrom TTS.utils.io import save_fsspec\n\n\nclass AugmentWAV(object):\n    def __init__(self, ap, augmentation_config):\n        self.ap = ap\n        self.use_additive_noise = False\n\n        if \"additive\" in augmentation_config.keys():\n            self.additive_noise_config = augmentation_config[\"additive\"]\n            additive_path = self.additive_noise_config[\"sounds_path\"]\n            if additive_path:\n                self.use_additive_noise = True\n                # get noise types\n                self.additive_noise_types = []\n                for key in self.additive_noise_config.keys():\n                    if isinstance(self.additive_noise_config[key], dict):\n                        self.additive_noise_types.append(key)\n\n                additive_files = glob.glob(os.path.join(additive_path, \"**/*.wav\"), recursive=True)\n\n                self.noise_list = {}\n\n                for wav_file in additive_files:\n                    noise_dir = wav_file.replace(additive_path, \"\").split(os.sep)[0]\n                    # ignore not listed directories\n                    if noise_dir not in self.additive_noise_types:\n                        continue\n                    if not noise_dir in self.noise_list:\n                        self.noise_list[noise_dir] = []\n                    self.noise_list[noise_dir].append(wav_file)\n\n                print(\n                    f\" | > Using Additive Noise Augmentation: with {len(additive_files)} audios instances from {self.additive_noise_types}\"\n                )\n\n        self.use_rir = False\n\n        if \"rir\" in augmentation_config.keys():\n            self.rir_config = augmentation_config[\"rir\"]\n            if 
self.rir_config[\"rir_path\"]:\n                self.rir_files = glob.glob(os.path.join(self.rir_config[\"rir_path\"], \"**/*.wav\"), recursive=True)\n                self.use_rir = True\n\n                print(f\" | > Using RIR Noise Augmentation: with {len(self.rir_files)} audio instances\")\n\n        self.create_augmentation_global_list()\n\n    def create_augmentation_global_list(self):\n        if self.use_additive_noise:\n            # copy the list so that appending \"RIR_AUG\" does not alter additive_noise_types\n            self.global_noise_list = list(self.additive_noise_types)\n        else:\n            self.global_noise_list = []\n        if self.use_rir:\n            self.global_noise_list.append(\"RIR_AUG\")\n\n    def additive_noise(self, noise_type, audio):\n        clean_db = 10 * np.log10(np.mean(audio**2) + 1e-4)\n\n        noise_list = random.sample(\n            self.noise_list[noise_type],\n            random.randint(\n                self.additive_noise_config[noise_type][\"min_num_noises\"],\n                self.additive_noise_config[noise_type][\"max_num_noises\"],\n            ),\n        )\n\n        audio_len = audio.shape[0]\n        noises_wav = None\n        for noise in noise_list:\n            noiseaudio = self.ap.load_wav(noise, sr=self.ap.sample_rate)[:audio_len]\n\n            if noiseaudio.shape[0] < audio_len:\n                continue\n\n            noise_snr = random.uniform(\n                self.additive_noise_config[noise_type][\"min_snr_in_db\"],\n                self.additive_noise_config[noise_type][\"max_snr_in_db\"],\n            )\n            noise_db = 10 * np.log10(np.mean(noiseaudio**2) + 1e-4)\n            noise_wav = np.sqrt(10 ** ((clean_db - noise_db - noise_snr) / 10)) * noiseaudio\n\n            if noises_wav is None:\n                noises_wav = noise_wav\n            else:\n                noises_wav += noise_wav\n\n        # if all sampled noise files are shorter than the audio, sample other files\n        if noises_wav is None:\n            return self.additive_noise(noise_type, audio)\n\n        return audio + 
noises_wav\n\n    def reverberate(self, audio):\n        audio_len = audio.shape[0]\n\n        rir_file = random.choice(self.rir_files)\n        rir = self.ap.load_wav(rir_file, sr=self.ap.sample_rate)\n        rir = rir / np.sqrt(np.sum(rir**2))\n        return signal.convolve(audio, rir, mode=self.rir_config[\"conv_mode\"])[:audio_len]\n\n    def apply_one(self, audio):\n        noise_type = random.choice(self.global_noise_list)\n        if noise_type == \"RIR_AUG\":\n            return self.reverberate(audio)\n\n        return self.additive_noise(noise_type, audio)\n\n\ndef to_camel(text):\n    text = text.capitalize()\n    return re.sub(r\"(?!^)_([a-zA-Z])\", lambda m: m.group(1).upper(), text)\n\n\ndef setup_encoder_model(config: \"Coqpit\"):\n    if config.model_params[\"model_name\"].lower() == \"lstm\":\n        model = LSTMSpeakerEncoder(\n            config.model_params[\"input_dim\"],\n            config.model_params[\"proj_dim\"],\n            config.model_params[\"lstm_dim\"],\n            config.model_params[\"num_lstm_layers\"],\n            use_torch_spec=config.model_params.get(\"use_torch_spec\", False),\n            audio_config=config.audio,\n        )\n    elif config.model_params[\"model_name\"].lower() == \"resnet\":\n        model = ResNetSpeakerEncoder(\n            input_dim=config.model_params[\"input_dim\"],\n            proj_dim=config.model_params[\"proj_dim\"],\n            log_input=config.model_params.get(\"log_input\", False),\n            use_torch_spec=config.model_params.get(\"use_torch_spec\", False),\n            audio_config=config.audio,\n        )\n    return model\n\n\ndef save_checkpoint(model, optimizer, criterion, model_loss, out_path, current_step, epoch):\n    checkpoint_path = \"checkpoint_{}.pth\".format(current_step)\n    checkpoint_path = os.path.join(out_path, checkpoint_path)\n    print(\" | | > Checkpoint saving : {}\".format(checkpoint_path))\n\n    new_state_dict = model.state_dict()\n    state = {\n        
\"model\": new_state_dict,\n        \"optimizer\": optimizer.state_dict() if optimizer is not None else None,\n        \"criterion\": criterion.state_dict(),\n        \"step\": current_step,\n        \"epoch\": epoch,\n        \"loss\": model_loss,\n        \"date\": datetime.date.today().strftime(\"%B %d, %Y\"),\n    }\n    save_fsspec(state, checkpoint_path)\n\n\ndef save_best_model(model, optimizer, criterion, model_loss, best_loss, out_path, current_step, epoch):\n    if model_loss < best_loss:\n        new_state_dict = model.state_dict()\n        state = {\n            \"model\": new_state_dict,\n            \"optimizer\": optimizer.state_dict(),\n            \"criterion\": criterion.state_dict(),\n            \"step\": current_step,\n            \"epoch\": epoch,\n            \"loss\": model_loss,\n            \"date\": datetime.date.today().strftime(\"%B %d, %Y\"),\n        }\n        best_loss = model_loss\n        bestmodel_path = \"best_model.pth\"\n        bestmodel_path = os.path.join(out_path, bestmodel_path)\n        print(\"\\n > BEST MODEL ({0:.5f}) : {1:}\".format(model_loss, bestmodel_path))\n        save_fsspec(state, bestmodel_path)\n    return best_loss\n"
  },
  {
    "path": "TTS/encoder/utils/io.py",
    "content": "import datetime\nimport os\n\nfrom TTS.utils.io import save_fsspec\n\n\ndef save_checkpoint(model, optimizer, model_loss, out_path, current_step):\n    checkpoint_path = \"checkpoint_{}.pth\".format(current_step)\n    checkpoint_path = os.path.join(out_path, checkpoint_path)\n    print(\" | | > Checkpoint saving : {}\".format(checkpoint_path))\n\n    new_state_dict = model.state_dict()\n    state = {\n        \"model\": new_state_dict,\n        \"optimizer\": optimizer.state_dict() if optimizer is not None else None,\n        \"step\": current_step,\n        \"loss\": model_loss,\n        \"date\": datetime.date.today().strftime(\"%B %d, %Y\"),\n    }\n    save_fsspec(state, checkpoint_path)\n\n\ndef save_best_model(model, optimizer, model_loss, best_loss, out_path, current_step):\n    if model_loss < best_loss:\n        new_state_dict = model.state_dict()\n        state = {\n            \"model\": new_state_dict,\n            \"optimizer\": optimizer.state_dict(),\n            \"step\": current_step,\n            \"loss\": model_loss,\n            \"date\": datetime.date.today().strftime(\"%B %d, %Y\"),\n        }\n        best_loss = model_loss\n        bestmodel_path = \"best_model.pth\"\n        bestmodel_path = os.path.join(out_path, bestmodel_path)\n        print(\"\\n > BEST MODEL ({0:.5f}) : {1:}\".format(model_loss, bestmodel_path))\n        save_fsspec(state, bestmodel_path)\n    return best_loss\n"
  },
  {
    "path": "TTS/encoder/utils/prepare_voxceleb.py",
    "content": "# coding=utf-8\n# Copyright (C) 2020 ATHENA AUTHORS; Yiping Peng; Ne Luo\n# All rights reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n# Only support eager mode and TF>=2.0.0\n# pylint: disable=no-member, invalid-name, relative-beyond-top-level\n# pylint: disable=too-many-locals, too-many-statements, too-many-arguments, too-many-instance-attributes\n\"\"\" voxceleb 1 & 2 \"\"\"\n\nimport hashlib\nimport os\nimport subprocess\nimport sys\nimport zipfile\n\nimport pandas\nimport soundfile as sf\nfrom absl import logging\n\nSUBSETS = {\n    \"vox1_dev_wav\": [\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox1_dev_wav_partaa\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox1_dev_wav_partab\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox1_dev_wav_partac\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox1_dev_wav_partad\",\n    ],\n    \"vox1_test_wav\": [\"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox1_test_wav.zip\"],\n    \"vox2_dev_aac\": [\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partaa\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partab\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partac\",\n        
\"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partad\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partae\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partaf\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partag\",\n        \"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_dev_aac_partah\",\n    ],\n    \"vox2_test_aac\": [\"https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/vox2_test_aac.zip\"],\n}\n\nMD5SUM = {\n    \"vox1_dev_wav\": \"ae63e55b951748cc486645f532ba230b\",\n    \"vox2_dev_aac\": \"bbc063c46078a602ca71605645c2a402\",\n    \"vox1_test_wav\": \"185fdc63c3c739954633d50379a3d102\",\n    \"vox2_test_aac\": \"0d2b3ea430a821c33263b5ea37ede312\",\n}\n\nUSER = {\"user\": \"\", \"password\": \"\"}\n\nspeaker_id_dict = {}\n\n\ndef download_and_extract(directory, subset, urls):\n    \"\"\"Download and extract the given split of dataset.\n\n    Args:\n        directory: the directory where to put the downloaded data.\n        subset: subset name of the corpus.\n        urls: the list of urls to download the data file.\n    \"\"\"\n    os.makedirs(directory, exist_ok=True)\n\n    try:\n        for url in urls:\n            zip_filepath = os.path.join(directory, url.split(\"/\")[-1])\n            if os.path.exists(zip_filepath):\n                continue\n            logging.info(\"Downloading %s to %s\" % (url, zip_filepath))\n            subprocess.call(\n                \"wget %s --user %s --password %s -O %s\" % (url, USER[\"user\"], USER[\"password\"], zip_filepath),\n                shell=True,\n            )\n\n            statinfo = os.stat(zip_filepath)\n            logging.info(\"Successfully downloaded %s, size(bytes): %d\" % (url, statinfo.st_size))\n\n        # concatenate all parts into zip files\n        if \".zip\" not in zip_filepath:\n            zip_filepath = \"_\".join(zip_filepath.split(\"_\")[:-1])\n   
         subprocess.call(\"cat %s* > %s.zip\" % (zip_filepath, zip_filepath), shell=True)\n            zip_filepath += \".zip\"\n        # str.strip(\".zip\") strips characters, not a suffix; drop the extension instead\n        extract_path = os.path.splitext(zip_filepath)[0]\n\n        # check zip file md5sum\n        with open(zip_filepath, \"rb\") as f_zip:\n            md5 = hashlib.md5(f_zip.read()).hexdigest()\n        if md5 != MD5SUM[subset]:\n            raise ValueError(\"md5sum of %s mismatch\" % zip_filepath)\n\n        with zipfile.ZipFile(zip_filepath, \"r\") as zfile:\n            zfile.extractall(directory)\n            extract_path_ori = os.path.join(directory, zfile.infolist()[0].filename)\n            subprocess.call(\"mv %s %s\" % (extract_path_ori, extract_path), shell=True)\n    finally:\n        # os.remove(zip_filepath)\n        pass\n\n\ndef exec_cmd(cmd):\n    \"\"\"Run a command in a subprocess.\n    Args:\n        cmd: command line to be executed.\n    Return:\n        int, the return code.\n    \"\"\"\n    try:\n        retcode = subprocess.call(cmd, shell=True)\n        if retcode < 0:\n            logging.info(f\"Child was terminated by signal {retcode}\")\n    except OSError as e:\n        logging.info(f\"Execution failed: {e}\")\n        retcode = -999\n    return retcode\n\n\ndef decode_aac_with_ffmpeg(aac_file, wav_file):\n    \"\"\"Decode a given AAC file into WAV using ffmpeg.\n    Args:\n        aac_file: file path to input AAC file.\n        wav_file: file path to output WAV file.\n    Return:\n        bool, True if success.\n    \"\"\"\n    cmd = f\"ffmpeg -i {aac_file} {wav_file}\"\n    logging.info(f\"Decoding aac file using command line: {cmd}\")\n    ret = exec_cmd(cmd)\n    if ret != 0:\n        logging.error(f\"Failed to decode aac file with retcode {ret}\")\n        logging.error(\"Please check your ffmpeg installation.\")\n        return False\n    return True\n\n\ndef convert_audio_and_make_label(input_dir, subset, output_dir, output_file):\n    \"\"\"Optionally convert AAC to WAV and make speaker labels.\n    
Args:\n        input_dir: the directory which holds the input dataset.\n        subset: the name of the specified subset. e.g. vox1_dev_wav\n        output_dir: the directory to place the newly generated csv files.\n        output_file: the name of the newly generated csv file. e.g. vox1_dev_wav.csv\n    \"\"\"\n\n    logging.info(\"Preprocessing audio and label for subset %s\" % subset)\n    source_dir = os.path.join(input_dir, subset)\n\n    files = []\n    # Convert all AAC file into WAV format. At the same time, generate the csv\n    for root, _, filenames in os.walk(source_dir):\n        for filename in filenames:\n            name, ext = os.path.splitext(filename)\n            if ext.lower() == \".wav\":\n                _, ext2 = os.path.splitext(name)\n                if ext2:\n                    continue\n                wav_file = os.path.join(root, filename)\n            elif ext.lower() == \".m4a\":\n                # Convert AAC to WAV.\n                aac_file = os.path.join(root, filename)\n                wav_file = aac_file + \".wav\"\n                if not os.path.exists(wav_file):\n                    if not decode_aac_with_ffmpeg(aac_file, wav_file):\n                        raise RuntimeError(\"Audio decoding failed.\")\n            else:\n                continue\n            speaker_name = root.split(os.path.sep)[-2]\n            if speaker_name not in speaker_id_dict:\n                num = len(speaker_id_dict)\n                speaker_id_dict[speaker_name] = num\n            # wav_filesize = os.path.getsize(wav_file)\n            wav_length = len(sf.read(wav_file)[0])\n            files.append((os.path.abspath(wav_file), wav_length, speaker_id_dict[speaker_name], speaker_name))\n\n    # Write to CSV file which contains four columns:\n    # \"wav_filename\", \"wav_length_ms\", \"speaker_id\", \"speaker_name\".\n    csv_file_path = os.path.join(output_dir, output_file)\n    df = pandas.DataFrame(data=files, columns=[\"wav_filename\", 
\"wav_length_ms\", \"speaker_id\", \"speaker_name\"])\n    df.to_csv(csv_file_path, index=False, sep=\"\\t\")\n    logging.info(\"Successfully generated csv file {}\".format(csv_file_path))\n\n\ndef processor(directory, subset, force_process):\n    \"\"\"download and process\"\"\"\n    urls = SUBSETS\n    if subset not in urls:\n        raise ValueError(subset, \"is not in voxceleb\")\n\n    subset_csv = os.path.join(directory, subset + \".csv\")\n    if not force_process and os.path.exists(subset_csv):\n        return subset_csv\n\n    logging.info(\"Downloading and process the voxceleb in %s\", directory)\n    logging.info(\"Preparing subset %s\", subset)\n    download_and_extract(directory, subset, urls[subset])\n    convert_audio_and_make_label(directory, subset, directory, subset + \".csv\")\n    logging.info(\"Finished downloading and processing\")\n    return subset_csv\n\n\nif __name__ == \"__main__\":\n    logging.set_verbosity(logging.INFO)\n    if len(sys.argv) != 4:\n        print(\"Usage: python prepare_data.py save_directory user password\")\n        sys.exit()\n\n    DIR, USER[\"user\"], USER[\"password\"] = sys.argv[1], sys.argv[2], sys.argv[3]\n    for SUBSET in SUBSETS:\n        processor(DIR, SUBSET, False)\n"
  },
  {
    "path": "TTS/encoder/utils/training.py",
    "content": "import os\nfrom dataclasses import dataclass, field\n\nfrom coqpit import Coqpit\nfrom trainer import TrainerArgs, get_last_checkpoint\nfrom trainer.logging import logger_factory\nfrom trainer.logging.console_logger import ConsoleLogger\n\nfrom TTS.config import load_config, register_config\nfrom TTS.tts.utils.text.characters import parse_symbols\nfrom TTS.utils.generic_utils import get_experiment_folder_path, get_git_branch\nfrom TTS.utils.io import copy_model_files\n\n\n@dataclass\nclass TrainArgs(TrainerArgs):\n    config_path: str = field(default=None, metadata={\"help\": \"Path to the config file.\"})\n\n\ndef getarguments():\n    train_config = TrainArgs()\n    parser = train_config.init_argparse(arg_prefix=\"\")\n    return parser\n\n\ndef process_args(args, config=None):\n    \"\"\"Process parsed comand line arguments and initialize the config if not provided.\n    Args:\n        args (argparse.Namespace or dict like): Parsed input arguments.\n        config (Coqpit): Model config. If none, it is generated from `args`. 
Defaults to None.\n    Returns:\n        c (TTS.utils.io.AttrDict): Config paramaters.\n        out_path (str): Path to save models and logging.\n        audio_path (str): Path to save generated test audios.\n        c_logger (TTS.utils.console_logger.ConsoleLogger): Class that does\n            logging to the console.\n        dashboard_logger (WandbLogger or TensorboardLogger): Class that does the dashboard Logging\n    TODO:\n        - Interactive config definition.\n    \"\"\"\n    if isinstance(args, tuple):\n        args, coqpit_overrides = args\n    if args.continue_path:\n        # continue a previous training from its output folder\n        experiment_path = args.continue_path\n        args.config_path = os.path.join(args.continue_path, \"config.json\")\n        args.restore_path, best_model = get_last_checkpoint(args.continue_path)\n        if not args.best_path:\n            args.best_path = best_model\n    # init config if not already defined\n    if config is None:\n        if args.config_path:\n            # init from a file\n            config = load_config(args.config_path)\n        else:\n            # init from console args\n            from TTS.config.shared_configs import BaseTrainingConfig  # pylint: disable=import-outside-toplevel\n\n            config_base = BaseTrainingConfig()\n            config_base.parse_known_args(coqpit_overrides)\n            config = register_config(config_base.model)()\n    # override values from command-line args\n    config.parse_known_args(coqpit_overrides, relaxed_parser=True)\n    experiment_path = args.continue_path\n    if not experiment_path:\n        experiment_path = get_experiment_folder_path(config.output_path, config.run_name)\n    audio_path = os.path.join(experiment_path, \"test_audios\")\n    config.output_log_path = experiment_path\n    # setup rank 0 process in distributed training\n    dashboard_logger = None\n    if args.rank == 0:\n        new_fields = {}\n        if args.restore_path:\n         
   new_fields[\"restore_path\"] = args.restore_path\n        new_fields[\"github_branch\"] = get_git_branch()\n        # if model characters are not set in the config file\n        # save the default set to the config file for future\n        # compatibility.\n        if config.has(\"characters\") and config.characters is None:\n            used_characters = parse_symbols()\n            new_fields[\"characters\"] = used_characters\n        copy_model_files(config, experiment_path, new_fields)\n        dashboard_logger = logger_factory(config, experiment_path)\n    c_logger = ConsoleLogger()\n    return config, experiment_path, audio_path, c_logger, dashboard_logger\n\n\ndef init_arguments():\n    train_config = TrainArgs()\n    parser = train_config.init_argparse(arg_prefix=\"\")\n    return parser\n\n\ndef init_training(config: Coqpit = None):\n    \"\"\"Initialization of a training run.\"\"\"\n    parser = init_arguments()\n    args = parser.parse_known_args()\n    config, OUT_PATH, AUDIO_PATH, c_logger, dashboard_logger = process_args(args, config)\n    return args[0], config, OUT_PATH, AUDIO_PATH, c_logger, dashboard_logger\n"
  },
  {
    "path": "TTS/encoder/utils/visual.py",
    "content": "import matplotlib\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport umap\n\nmatplotlib.use(\"Agg\")\n\n\ncolormap = (\n    np.array(\n        [\n            [76, 255, 0],\n            [0, 127, 70],\n            [255, 0, 0],\n            [255, 217, 38],\n            [0, 135, 255],\n            [165, 0, 165],\n            [255, 167, 255],\n            [0, 255, 255],\n            [255, 96, 38],\n            [142, 76, 0],\n            [33, 0, 127],\n            [0, 0, 0],\n            [183, 183, 183],\n        ],\n        dtype=np.float,\n    )\n    / 255\n)\n\n\ndef plot_embeddings(embeddings, num_classes_in_batch):\n    num_utter_per_class = embeddings.shape[0] // num_classes_in_batch\n\n    # if necessary get just the first 10 classes\n    if num_classes_in_batch > 10:\n        num_classes_in_batch = 10\n        embeddings = embeddings[: num_classes_in_batch * num_utter_per_class]\n\n    model = umap.UMAP()\n    projection = model.fit_transform(embeddings)\n    ground_truth = np.repeat(np.arange(num_classes_in_batch), num_utter_per_class)\n    colors = [colormap[i] for i in ground_truth]\n    fig, ax = plt.subplots(figsize=(16, 10))\n    _ = ax.scatter(projection[:, 0], projection[:, 1], c=colors)\n    plt.gca().set_aspect(\"equal\", \"datalim\")\n    plt.title(\"UMAP projection\")\n    plt.tight_layout()\n    plt.savefig(\"umap\")\n    return fig\n"
  },
  {
    "path": "TTS/model.py",
    "content": "from abc import abstractmethod\nfrom typing import Dict\n\nimport torch\nfrom coqpit import Coqpit\nfrom trainer import TrainerModel\n\n# pylint: skip-file\n\n\nclass BaseTrainerModel(TrainerModel):\n    \"\"\"BaseTrainerModel model expanding TrainerModel with required functions by 🐸TTS.\n\n    Every new 🐸TTS model must inherit it.\n    \"\"\"\n\n    @staticmethod\n    @abstractmethod\n    def init_from_config(config: Coqpit):\n        \"\"\"Init the model and all its attributes from the given config.\n\n        Override this depending on your model.\n        \"\"\"\n        ...\n\n    @abstractmethod\n    def inference(self, input: torch.Tensor, aux_input={}) -> Dict:\n        \"\"\"Forward pass for inference.\n\n        It must return a dictionary with the main model output and all the auxiliary outputs. The key ```model_outputs```\n        is considered to be the main output and you can add any other auxiliary outputs as you want.\n\n        We don't use `*kwargs` since it is problematic with the TorchScript API.\n\n        Args:\n            input (torch.Tensor): [description]\n            aux_input (Dict): Auxiliary inputs like speaker embeddings, durations etc.\n\n        Returns:\n            Dict: [description]\n        \"\"\"\n        outputs_dict = {\"model_outputs\": None}\n        ...\n        return outputs_dict\n\n    @abstractmethod\n    def load_checkpoint(\n        self, config: Coqpit, checkpoint_path: str, eval: bool = False, strict: bool = True, cache=False\n    ) -> None:\n        \"\"\"Load a model checkpoint gile and get ready for training or inference.\n\n        Args:\n            config (Coqpit): Model configuration.\n            checkpoint_path (str): Path to the model checkpoint file.\n            eval (bool, optional): If true, init model for inference else for training. Defaults to False.\n            strict (bool, optional): Match all checkpoint keys to model's keys. 
Defaults to True.\n            cache (bool, optional): If True, cache the file locally for subsequent calls. It is cached under `get_user_data_dir()/tts_cache`. Defaults to False.\n        \"\"\"\n        ...\n"
  },
  {
    "path": "TTS/server/README.md",
    "content": "# :frog: TTS demo server\nBefore you use the server, make sure you [install](https://github.com/coqui-ai/TTS/tree/dev#install-tts)) :frog: TTS properly. Then, you can follow the steps below.\n\n**Note:** If you install :frog:TTS using ```pip```, you can also use the ```tts-server``` end point on the terminal.\n\nExamples runs:\n\nList officially released models.\n```python TTS/server/server.py  --list_models ```\n\nRun the server with the official models.\n```python TTS/server/server.py  --model_name tts_models/en/ljspeech/tacotron2-DCA --vocoder_name vocoder_models/en/ljspeech/multiband-melgan```\n\nRun the server with the official models on a GPU.\n```CUDA_VISIBLE_DEVICES=\"0\" python TTS/server/server.py  --model_name tts_models/en/ljspeech/tacotron2-DCA --vocoder_name vocoder_models/en/ljspeech/multiband-melgan --use_cuda True```\n\nRun the server with a custom models.\n```python TTS/server/server.py  --tts_checkpoint /path/to/tts/model.pth --tts_config /path/to/tts/config.json --vocoder_checkpoint /path/to/vocoder/model.pth --vocoder_config /path/to/vocoder/config.json```\n"
  },
  {
    "path": "TTS/server/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/server/conf.json",
    "content": "{\n    \"tts_path\":\"/media/erogol/data_ssd/Models/libri_tts/5049/\",  // tts model root folder\n    \"tts_file\":\"best_model.pth\",     // tts checkpoint file\n    \"tts_config\":\"config.json\",     // tts config.json file\n    \"tts_speakers\": null,           // json file listing speaker ids. null if no speaker embedding.\n    \"vocoder_config\":null,\n    \"vocoder_file\": null,\n    \"is_wavernn_batched\":true,\n    \"port\": 5002,\n    \"use_cuda\": true,\n    \"debug\": true\n}\n"
  },
  {
    "path": "TTS/server/server.py",
    "content": "#!flask/bin/python\nimport argparse\nimport io\nimport json\nimport os\nimport sys\nfrom pathlib import Path\nfrom threading import Lock\nfrom typing import Union\nfrom urllib.parse import parse_qs\n\nfrom flask import Flask, render_template, render_template_string, request, send_file\n\nfrom TTS.config import load_config\nfrom TTS.utils.manage import ModelManager\nfrom TTS.utils.synthesizer import Synthesizer\n\n\ndef create_argparser():\n    def convert_boolean(x):\n        return x.lower() in [\"true\", \"1\", \"yes\"]\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--list_models\",\n        type=convert_boolean,\n        nargs=\"?\",\n        const=True,\n        default=False,\n        help=\"list available pre-trained tts and vocoder models.\",\n    )\n    parser.add_argument(\n        \"--model_name\",\n        type=str,\n        default=\"tts_models/en/ljspeech/tacotron2-DDC\",\n        help=\"Name of one of the pre-trained tts models in format <language>/<dataset>/<model_name>\",\n    )\n    parser.add_argument(\"--vocoder_name\", type=str, default=None, help=\"name of one of the released vocoder models.\")\n\n    # Args for running custom models\n    parser.add_argument(\"--config_path\", default=None, type=str, help=\"Path to model config file.\")\n    parser.add_argument(\n        \"--model_path\",\n        type=str,\n        default=None,\n        help=\"Path to model file.\",\n    )\n    parser.add_argument(\n        \"--vocoder_path\",\n        type=str,\n        help=\"Path to vocoder model file. If it is not defined, model uses GL as vocoder. 
Please make sure that you installed vocoder library before (WaveRNN).\",\n        default=None,\n    )\n    parser.add_argument(\"--vocoder_config_path\", type=str, help=\"Path to vocoder model config file.\", default=None)\n    parser.add_argument(\"--speakers_file_path\", type=str, help=\"JSON file for multi-speaker model.\", default=None)\n    parser.add_argument(\"--port\", type=int, default=5002, help=\"port to listen on.\")\n    parser.add_argument(\"--use_cuda\", type=convert_boolean, default=False, help=\"true to use CUDA.\")\n    parser.add_argument(\"--debug\", type=convert_boolean, default=False, help=\"true to enable Flask debug mode.\")\n    parser.add_argument(\"--show_details\", type=convert_boolean, default=False, help=\"Generate model detail page.\")\n    return parser\n\n\n# parse the args\nargs = create_argparser().parse_args()\n\npath = Path(__file__).parent / \"../.models.json\"\nmanager = ModelManager(path)\n\nif args.list_models:\n    manager.list_models()\n    sys.exit()\n\n# update in-use models to the specified released models.\nmodel_path = None\nconfig_path = None\nspeakers_file_path = None\nvocoder_path = None\nvocoder_config_path = None\n\n# CASE1: list pre-trained TTS models\nif args.list_models:\n    manager.list_models()\n    sys.exit()\n\n# CASE2: load pre-trained model paths\nif args.model_name is not None and not args.model_path:\n    model_path, config_path, model_item = manager.download_model(args.model_name)\n    args.vocoder_name = model_item[\"default_vocoder\"] if args.vocoder_name is None else args.vocoder_name\n\nif args.vocoder_name is not None and not args.vocoder_path:\n    vocoder_path, vocoder_config_path, _ = manager.download_model(args.vocoder_name)\n\n# CASE3: set custom model paths\nif args.model_path is not None:\n    model_path = args.model_path\n    config_path = args.config_path\n    speakers_file_path = args.speakers_file_path\n\nif args.vocoder_path is not None:\n    vocoder_path = args.vocoder_path\n    
vocoder_config_path = args.vocoder_config_path\n\n# load models\nsynthesizer = Synthesizer(\n    tts_checkpoint=model_path,\n    tts_config_path=config_path,\n    tts_speakers_file=speakers_file_path,\n    tts_languages_file=None,\n    vocoder_checkpoint=vocoder_path,\n    vocoder_config=vocoder_config_path,\n    encoder_checkpoint=\"\",\n    encoder_config=\"\",\n    use_cuda=args.use_cuda,\n)\n\nuse_multi_speaker = hasattr(synthesizer.tts_model, \"num_speakers\") and (\n    synthesizer.tts_model.num_speakers > 1 or synthesizer.tts_speakers_file is not None\n)\nspeaker_manager = getattr(synthesizer.tts_model, \"speaker_manager\", None)\n\nuse_multi_language = hasattr(synthesizer.tts_model, \"num_languages\") and (\n    synthesizer.tts_model.num_languages > 1 or synthesizer.tts_languages_file is not None\n)\nlanguage_manager = getattr(synthesizer.tts_model, \"language_manager\", None)\n\n# TODO: set this from SpeakerManager\nuse_gst = synthesizer.tts_config.get(\"use_gst\", False)\napp = Flask(__name__)\n\n\ndef style_wav_uri_to_dict(style_wav: str) -> Union[str, dict]:\n    \"\"\"Transform an uri style_wav, in either a string (path to wav file to be use for style transfer)\n    or a dict (gst tokens/values to be use for styling)\n\n    Args:\n        style_wav (str): uri\n\n    Returns:\n        Union[str, dict]: path to file (str) or gst style (dict)\n    \"\"\"\n    if style_wav:\n        if os.path.isfile(style_wav) and style_wav.endswith(\".wav\"):\n            return style_wav  # style_wav is a .wav file located on the server\n\n        style_wav = json.loads(style_wav)\n        return style_wav  # style_wav is a gst dictionary with {token1_id : token1_weigth, ...}\n    return None\n\n\n@app.route(\"/\")\ndef index():\n    return render_template(\n        \"index.html\",\n        show_details=args.show_details,\n        use_multi_speaker=use_multi_speaker,\n        use_multi_language=use_multi_language,\n        speaker_ids=speaker_manager.name_to_id if 
speaker_manager is not None else None,\n        language_ids=language_manager.name_to_id if language_manager is not None else None,\n        use_gst=use_gst,\n    )\n\n\n@app.route(\"/details\")\ndef details():\n    # use the resolved module-level paths; argparse defines no `tts_config` or `vocoder_config` options\n    model_config = load_config(config_path)\n    if vocoder_config_path is not None and os.path.isfile(vocoder_config_path):\n        vocoder_config = load_config(vocoder_config_path)\n    else:\n        vocoder_config = None\n\n    return render_template(\n        \"details.html\",\n        show_details=args.show_details,\n        model_config=model_config,\n        vocoder_config=vocoder_config,\n        args=args.__dict__,\n    )\n\n\nlock = Lock()\n\n\n@app.route(\"/api/tts\", methods=[\"GET\"])\ndef tts():\n    with lock:\n        text = request.args.get(\"text\")\n        speaker_idx = request.args.get(\"speaker_id\", \"\")\n        language_idx = request.args.get(\"language_id\", \"\")\n        style_wav = request.args.get(\"style_wav\", \"\")\n        style_wav = style_wav_uri_to_dict(style_wav)\n        print(f\" > Model input: {text}\")\n        print(f\" > Speaker Idx: {speaker_idx}\")\n        print(f\" > Language Idx: {language_idx}\")\n        wavs = synthesizer.tts(text, speaker_name=speaker_idx, language_name=language_idx, style_wav=style_wav)\n        out = io.BytesIO()\n        synthesizer.save_wav(wavs, out)\n    return send_file(out, mimetype=\"audio/wav\")\n\n\n# Basic MaryTTS compatibility layer\n\n\n@app.route(\"/locales\", methods=[\"GET\"])\ndef mary_tts_api_locales():\n    \"\"\"MaryTTS-compatible /locales endpoint\"\"\"\n    # NOTE: We currently assume there is only one model active at the same time\n    if args.model_name is not None:\n        model_details = args.model_name.split(\"/\")\n    else:\n        model_details = [\"\", \"en\", \"\", \"default\"]\n    return render_template_string(\"{{ locale }}\\n\", locale=model_details[1])\n\n\n@app.route(\"/voices\", methods=[\"GET\"])\ndef mary_tts_api_voices():\n    
\"\"\"MaryTTS-compatible /voices endpoint\"\"\"\n    # NOTE: We currently assume there is only one model active at the same time\n    if args.model_name is not None:\n        model_details = args.model_name.split(\"/\")\n    else:\n        model_details = [\"\", \"en\", \"\", \"default\"]\n    return render_template_string(\n        \"{{ name }} {{ locale }} {{ gender }}\\n\", name=model_details[3], locale=model_details[1], gender=\"u\"\n    )\n\n\n@app.route(\"/process\", methods=[\"GET\", \"POST\"])\ndef mary_tts_api_process():\n    \"\"\"MaryTTS-compatible /process endpoint\"\"\"\n    with lock:\n        if request.method == \"POST\":\n            data = parse_qs(request.get_data(as_text=True))\n            # NOTE: we ignore param. LOCALE and VOICE for now since we have only one active model\n            text = data.get(\"INPUT_TEXT\", [\"\"])[0]\n        else:\n            text = request.args.get(\"INPUT_TEXT\", \"\")\n        print(f\" > Model input: {text}\")\n        wavs = synthesizer.tts(text)\n        out = io.BytesIO()\n        synthesizer.save_wav(wavs, out)\n    return send_file(out, mimetype=\"audio/wav\")\n\n\ndef main():\n    app.run(debug=args.debug, host=\"::\", port=args.port)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "TTS/server/templates/details.html",
    "content": "<!DOCTYPE html>\n<html lang=\"en\">\n\n<head>\n\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1, shrink-to-fit=no\">\n  <meta name=\"description\" content=\"\">\n  <meta name=\"author\" content=\"\">\n\n  <title>TTS engine</title>\n\n  <!-- Bootstrap core CSS -->\n  <link href=\"https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/css/bootstrap.min.css\"\n    integrity=\"sha384-WskhaSGFgHYWDcbwN70/dfYBj47jz9qbsMId/iRN3ewGhXQFZCSftd1LZCfmhktB\" crossorigin=\"anonymous\"\n    rel=\"stylesheet\">\n\n  <!-- Custom styles for this template -->\n  <style>\n    body {\n      padding-top: 54px;\n    }\n\n    @media (min-width: 992px) {\n      body {\n        padding-top: 56px;\n      }\n    }\n  </style>\n</head>\n\n<body>\n  <a href=\"https://github.com/mozilla/TTS\"><img style=\"position: absolute; z-index:1000; top: 0; left: 0; border: 0;\"\n      src=\"https://s3.amazonaws.com/github/ribbons/forkme_left_darkblue_121621.png\" alt=\"Fork me on GitHub\"></a>\n\n  {% if show_details == true %}\n\n  <div class=\"container\">\n    <b>Model details</b>\n  </div>\n\n  <div class=\"container\">\n    <details>\n      <summary>CLI arguments:</summary>\n      <table border=\"1\" align=\"center\" width=\"75%\">\n        <tr>\n          <td> CLI key </td>\n          <td> Value </td>\n        </tr>\n\n        {% for key, value in args.items() %}\n\n        <tr>\n          <td>{{ key }}</td>\n          <td>{{ value }}</td>\n        </tr>\n\n        {% endfor %}\n      </table>\n    </details>\n  </div></br>\n\n  <div class=\"container\">\n\n    {% if model_config != None %}\n\n    <details>\n      <summary>Model config:</summary>\n\n      <table border=\"1\" align=\"center\" width=\"75%\">\n        <tr>\n          <td> Key </td>\n          <td> Value </td>\n        </tr>\n\n\n        {% for key, value in model_config.items() %}\n\n        <tr>\n          <td>{{ key }}</td>\n          <td>{{ value }}</td>\n        
</tr>\n\n        {% endfor %}\n\n      </table>\n    </details>\n\n    {% endif %}\n\n  </div></br>\n\n\n\n  <div class=\"container\">\n    {% if vocoder_config != None %}\n    <details>\n      <summary>Vocoder model config:</summary>\n\n      <table border=\"1\" align=\"center\" width=\"75%\">\n        <tr>\n          <td> Key </td>\n          <td> Value </td>\n        </tr>\n\n\n        {% for key, value in vocoder_config.items() %}\n\n        <tr>\n          <td>{{ key }}</td>\n          <td>{{ value }}</td>\n        </tr>\n\n        {% endfor %}\n\n\n      </table>\n    </details>\n    {% endif %}\n  </div></br>\n\n  {% else %}\n  <div class=\"container\">\n    <b>Please start server with --show_details=true to see details.</b>\n  </div>\n\n  {% endif %}\n\n</body>\n\n</html>"
  },
  {
    "path": "TTS/server/templates/index.html",
    "content": "<!DOCTYPE html>\n<html lang=\"en\">\n\n<head>\n\n    <meta charset=\"utf-8\">\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1, shrink-to-fit=no\">\n    <meta name=\"description\" content=\"🐸Coqui AI TTS demo server.\">\n    <meta name=\"author\" content=\"🐸Coqui AI TTS\">\n\n    <title>TTS engine</title>\n\n    <!-- Bootstrap core CSS -->\n    <link href=\"https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/css/bootstrap.min.css\"\n        integrity=\"sha384-WskhaSGFgHYWDcbwN70/dfYBj47jz9qbsMId/iRN3ewGhXQFZCSftd1LZCfmhktB\" crossorigin=\"anonymous\"\n        rel=\"stylesheet\">\n\n    <!-- Custom styles for this template -->\n    <style>\n        body {\n            padding-top: 54px;\n        }\n\n        @media (min-width: 992px) {\n            body {\n                padding-top: 56px;\n            }\n        }\n    </style>\n</head>\n\n<body>\n    <a href=\"https://github.com/coqui-ai/TTS\"><img style=\"position: absolute; z-index:1000; top: 0; left: 0; border: 0;\"\n            src=\"https://s3.amazonaws.com/github/ribbons/forkme_left_darkblue_121621.png\" alt=\"Fork me on GitHub\"></a>\n\n    <!-- Navigation -->\n    <!--\n    <nav class=\"navbar navbar-expand-lg navbar-dark bg-dark fixed-top\">\n      <div class=\"container\">\n        <a class=\"navbar-brand\" href=\"#\">Coqui TTS</a>\n        <button class=\"navbar-toggler\" type=\"button\" data-toggle=\"collapse\" data-target=\"#navbarResponsive\" aria-controls=\"navbarResponsive\" aria-expanded=\"false\" aria-label=\"Toggle navigation\">\n          <span class=\"navbar-toggler-icon\"></span>\n        </button>\n        <div class=\"collapse navbar-collapse\" id=\"navbarResponsive\">\n          <ul class=\"navbar-nav ml-auto\">\n            <li class=\"nav-item active\">\n              <a class=\"nav-link\" href=\"#\">Home\n                <span class=\"sr-only\">(current)</span>\n              </a>\n            </li>\n          </ul>\n        </div>\n      
</div>\n    </nav>\n    -->\n\n    <!-- Page Content -->\n    <div class=\"container\">\n        <div class=\"row\">\n            <div class=\"col-lg-12 text-center\">\n                <img class=\"mt-5\" src=\"{{url_for('static', filename='coqui-log-green-TTS.png')}}\" align=\"middle\"\n                    width=\"512\" />\n\n                <ul class=\"list-unstyled\">\n                </ul>\n\n                {%if use_gst%}\n                <input value='{\"0\": 0.1}' id=\"style_wav\" placeholder=\"style wav (dict or path to wav)..\" size=45\n                    type=\"text\" name=\"style_wav\">\n                {%endif%}\n\n                <input id=\"text\" placeholder=\"Type here...\" size=45 type=\"text\" name=\"text\">\n                <button id=\"speak-button\" name=\"speak\">Speak</button><br /><br />\n\n                {%if use_multi_speaker%}\n                Choose a speaker:\n                <select id=\"speaker_id\" name=speaker_id method=\"GET\" action=\"/\">\n                    {% for speaker_id in speaker_ids %}\n                    <option value=\"{{speaker_id}}\" SELECTED>{{speaker_id}}</option>\"\n                    {% endfor %}\n                </select><br /><br />\n                {%endif%}\n\n                {%if use_multi_language%}\n                Choose a language:\n                <select id=\"language_id\" name=language_id method=\"GET\" action=\"/\">\n                    {% for language_id in language_ids %}\n                    <option value=\"{{language_id}}\" SELECTED>{{language_id}}</option>\"\n                    {% endfor %}\n                </select><br /><br />\n                {%endif%}\n\n\n                {%if show_details%}\n                <button id=\"details-button\" onclick=\"location.href = 'details'\" name=\"model-details\">Model\n                    Details</button><br /><br />\n                {%endif%}\n                <audio id=\"audio\" controls autoplay hidden></audio>\n                <p 
id=\"message\"></p>\n            </div>\n        </div>\n    </div>\n\n    <!-- Bootstrap core JavaScript -->\n    <script>\n        function getTextValue(textId) {\n            const container = q(textId)\n            if (container) {\n                return container.value\n            }\n            return \"\"\n        }\n        function q(selector) { return document.querySelector(selector) }\n        q('#text').focus()\n        function do_tts(e) {\n            const text = q('#text').value\n            const speaker_id = getTextValue('#speaker_id')\n            const style_wav = getTextValue('#style_wav')\n            const language_id = getTextValue('#language_id')\n            if (text) {\n                q('#message').textContent = 'Synthesizing...'\n                q('#speak-button').disabled = true\n                q('#audio').hidden = true\n                synthesize(text, speaker_id, style_wav, language_id)\n            }\n            e.preventDefault()\n            return false\n        }\n        q('#speak-button').addEventListener('click', do_tts)\n        q('#text').addEventListener('keyup', function (e) {\n            if (e.keyCode == 13) { // enter\n                do_tts(e)\n            }\n        })\n        function synthesize(text, speaker_id = \"\", style_wav = \"\", language_id = \"\") {\n            fetch(`/api/tts?text=${encodeURIComponent(text)}&speaker_id=${encodeURIComponent(speaker_id)}&style_wav=${encodeURIComponent(style_wav)}&language_id=${encodeURIComponent(language_id)}`, { cache: 'no-cache' })\n                .then(function (res) {\n                    if (!res.ok) throw Error(res.statusText)\n                    return res.blob()\n                }).then(function (blob) {\n                    q('#message').textContent = ''\n                    q('#speak-button').disabled = false\n                    q('#audio').src = URL.createObjectURL(blob)\n                    q('#audio').hidden = false\n                }).catch(function 
(err) {\n                    q('#message').textContent = 'Error: ' + err.message\n                    q('#speak-button').disabled = false\n                })\n        }\n    </script>\n\n</body>\n\n</html>"
  },
  {
    "path": "TTS/tts/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/configs/__init__.py",
    "content": "import importlib\nimport os\nfrom inspect import isclass\n\n# import all files under configs/\n# configs_dir = os.path.dirname(__file__)\n# for file in os.listdir(configs_dir):\n#     path = os.path.join(configs_dir, file)\n#     if not file.startswith(\"_\") and not file.startswith(\".\") and (file.endswith(\".py\") or os.path.isdir(path)):\n#         config_name = file[: file.find(\".py\")] if file.endswith(\".py\") else file\n#         module = importlib.import_module(\"TTS.tts.configs.\" + config_name)\n#         for attribute_name in dir(module):\n#             attribute = getattr(module, attribute_name)\n\n#             if isclass(attribute):\n#                 # Add the class to this package's variables\n#                 globals()[attribute_name] = attribute\n"
  },
  {
    "path": "TTS/tts/configs/align_tts_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\nfrom TTS.tts.models.align_tts import AlignTTSArgs\n\n\n@dataclass\nclass AlignTTSConfig(BaseTTSConfig):\n    \"\"\"Defines parameters for AlignTTS model.\n    Example:\n\n        >>> from TTS.tts.configs.align_tts_config import AlignTTSConfig\n        >>> config = AlignTTSConfig()\n\n    Args:\n        model(str):\n            Model name used for selecting the right model at initialization. Defaults to `align_tts`.\n        positional_encoding (bool):\n            enable / disable positional encoding applied to the encoder output. Defaults to True.\n        hidden_channels (int):\n            Base number of hidden channels. Defines all the layers expect ones defined by the specific encoder or decoder\n            parameters. Defaults to 256.\n        hidden_channels_dp (int):\n            Number of hidden channels of the duration predictor's layers. Defaults to 256.\n        encoder_type (str):\n            Type of the encoder used by the model. Look at `TTS.tts.layers.feed_forward.encoder` for more details.\n            Defaults to `fftransformer`.\n        encoder_params (dict):\n            Parameters used to define the encoder network. Look at `TTS.tts.layers.feed_forward.encoder` for more details.\n            Defaults to `{\"hidden_channels_ffn\": 1024, \"num_heads\": 2, \"num_layers\": 6, \"dropout_p\": 0.1}`.\n        decoder_type (str):\n            Type of the decoder used by the model. Look at `TTS.tts.layers.feed_forward.decoder` for more details.\n            Defaults to `fftransformer`.\n        decoder_params (dict):\n            Parameters used to define the decoder network. 
Look at `TTS.tts.layers.feed_forward.decoder` for more details.\n            Defaults to `{\"hidden_channels_ffn\": 1024, \"num_heads\": 2, \"num_layers\": 6, \"dropout_p\": 0.1}`.\n        phase_start_steps (List[int]):\n            A list of the step counts required to start each next training phase. AlignTTS has 4 different training\n            phases. Thus you need to define 4 different values to enable phase based training. If None, it\n            trains the whole model together. Defaults to None.\n        ssim_alpha (float):\n            Weight for the SSIM loss. If set <= 0, disables the SSIM loss. Defaults to 1.0.\n        dur_loss_alpha (float):\n            Weight for the duration predictor's loss. Defaults to 1.0.\n        mdn_alpha (float):\n            Weight for the MDN loss. Defaults to 1.0.\n        spec_loss_alpha (float):\n            Weight for the MSE spectrogram loss. If set <= 0, disables the spectrogram loss. Defaults to 1.0.\n        use_speaker_embedding (bool):\n            enable / disable using speaker embeddings for multi-speaker models. If set True, the model is\n            in the multi-speaker mode. Defaults to False.\n        use_d_vector_file (bool):\n            enable / disable using external speaker embeddings in place of the learned embeddings. Defaults to False.\n        d_vector_file (str):\n            Path to the file including pre-computed speaker embeddings. Defaults to None.\n        noam_schedule (bool):\n            enable / disable the use of Noam LR scheduler. Defaults to False.\n        warmup_steps (int):\n            Number of warm-up steps for the Noam scheduler. Defaults to 4000.\n        lr (float):\n            Initial learning rate. Defaults to `1e-4`.\n        wd (float):\n            Weight decay coefficient. 
Defaults to `1e-7`.\n        min_seq_len (int):\n            Minimum input sequence length to be used at training.\n        max_seq_len (int):\n            Maximum input sequence length to be used at training. Larger values result in more VRAM usage.\"\"\"\n\n    model: str = \"align_tts\"\n    # model specific params\n    model_args: AlignTTSArgs = field(default_factory=AlignTTSArgs)\n    phase_start_steps: List[int] = None\n\n    ssim_alpha: float = 1.0\n    spec_loss_alpha: float = 1.0\n    dur_loss_alpha: float = 1.0\n    mdn_alpha: float = 1.0\n\n    # multi-speaker settings\n    use_speaker_embedding: bool = False\n    use_d_vector_file: bool = False\n    d_vector_file: str = None\n\n    # optimizer parameters\n    optimizer: str = \"Adam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6})\n    lr_scheduler: str = None\n    lr_scheduler_params: dict = None\n    lr: float = 1e-4\n    grad_clip: float = 5.0\n\n    # overrides\n    min_seq_len: int = 13\n    max_seq_len: int = 200\n    r: int = 1\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\",\n            \"Be a voice, not an echo.\",\n            \"I'm sorry Dave. I'm afraid I can't do that.\",\n            \"This cake is great. It's so delicious and moist.\",\n            \"Prior to November 22, 1963.\",\n        ]\n    )\n"
  },
  {
    "path": "TTS/tts/configs/fast_pitch_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\nfrom TTS.tts.models.forward_tts import ForwardTTSArgs\n\n\n@dataclass\nclass FastPitchConfig(BaseTTSConfig):\n    \"\"\"Configure `ForwardTTS` as FastPitch model.\n\n    Example:\n\n        >>> from TTS.tts.configs.fast_pitch_config import FastPitchConfig\n        >>> config = FastPitchConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `fast_pitch`.\n\n        base_model (str):\n            Name of the base model being configured as this model so that 🐸 TTS knows it needs to initiate\n            the base model rather than searching for the `model` implementation. Defaults to `forward_tts`.\n\n        model_args (Coqpit):\n            Model class arguments. Check `FastPitchArgs` for more details. Defaults to `FastPitchArgs()`.\n\n        data_dep_init_steps (int):\n            Number of steps used for computing normalization parameters at the beginning of the training. GlowTTS uses\n            Activation Normalization that pre-computes normalization stats at the beginning and use the same values\n            for the rest. Defaults to 10.\n\n        speakers_file (str):\n            Path to the file containing the list of speakers. Needed at inference for loading matching speaker ids to\n            speaker names. Defaults to `None`.\n\n        use_speaker_embedding (bool):\n            enable / disable using speaker embeddings for multi-speaker models. If set True, the model is\n            in the multi-speaker mode. Defaults to False.\n\n        use_d_vector_file (bool):\n            enable /disable using external speaker embeddings in place of the learned embeddings. Defaults to False.\n\n        d_vector_file (str):\n            Path to the file including pre-computed speaker embeddings. 
Defaults to None.\n\n        d_vector_dim (int):\n            Dimension of the external speaker embeddings. Defaults to 0.\n\n        optimizer (str):\n            Name of the model optimizer. Defaults to `Adam`.\n\n        optimizer_params (dict):\n            Arguments of the model optimizer. Defaults to `{\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6}`.\n\n        lr_scheduler (str):\n            Name of the learning rate scheduler. Defaults to `Noam`.\n\n        lr_scheduler_params (dict):\n            Arguments of the learning rate scheduler. Defaults to `{\"warmup_steps\": 4000}`.\n\n        lr (float):\n            Initial learning rate. Defaults to `1e-3`.\n\n        grad_clip (float):\n            Gradient norm clipping value. Defaults to `5.0`.\n\n        spec_loss_type (str):\n            Type of the spectrogram loss. Check `ForwardTTSLoss` for possible values. Defaults to `mse`.\n\n        duration_loss_type (str):\n            Type of the duration loss. Check `ForwardTTSLoss` for possible values. Defaults to `mse`.\n\n        use_ssim_loss (bool):\n            Enable/disable the use of SSIM (Structural Similarity) loss. Defaults to True.\n\n        wd (float):\n            Weight decay coefficient. Defaults to `1e-7`.\n\n        ssim_loss_alpha (float):\n            Weight for the SSIM loss. If set 0, disables the SSIM loss. Defaults to 1.0.\n\n        dur_loss_alpha (float):\n            Weight for the duration predictor's loss. If set 0, disables the huber loss. Defaults to 1.0.\n\n        spec_loss_alpha (float):\n            Weight for the L1 spectrogram loss. If set 0, disables the L1 loss. Defaults to 1.0.\n\n        pitch_loss_alpha (float):\n            Weight for the pitch predictor's loss. If set 0, disables the pitch predictor. Defaults to 1.0.\n\n        binary_align_loss_alpha (float):\n            Weight for the binary loss. If set 0, disables the binary loss. 
Defaults to 0.1.\n\n        binary_loss_warmup_epochs (int):\n            Number of epochs to gradually increase the binary loss impact. Defaults to 150.\n\n        min_seq_len (int):\n            Minimum input sequence length to be used at training.\n\n        max_seq_len (int):\n            Maximum input sequence length to be used at training. Larger values result in more VRAM usage.\n\n        # dataset configs\n        compute_f0 (bool):\n            Compute pitch. Defaults to True.\n\n        f0_cache_path (str):\n            Pitch cache path. Defaults to None.\n    \"\"\"\n\n    model: str = \"fast_pitch\"\n    base_model: str = \"forward_tts\"\n\n    # model specific params\n    # use a factory so each config gets its own args instance (`__post_init__` mutates it)\n    model_args: ForwardTTSArgs = field(default_factory=ForwardTTSArgs)\n\n    # multi-speaker settings\n    num_speakers: int = 0\n    speakers_file: str = None\n    use_speaker_embedding: bool = False\n    use_d_vector_file: bool = False\n    d_vector_file: str = None\n    d_vector_dim: int = 0\n\n    # optimizer parameters\n    optimizer: str = \"Adam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6})\n    lr_scheduler: str = \"NoamLR\"\n    lr_scheduler_params: dict = field(default_factory=lambda: {\"warmup_steps\": 4000})\n    lr: float = 1e-4\n    grad_clip: float = 5.0\n\n    # loss params\n    spec_loss_type: str = \"mse\"\n    duration_loss_type: str = \"mse\"\n    use_ssim_loss: bool = True\n    ssim_loss_alpha: float = 1.0\n    spec_loss_alpha: float = 1.0\n    aligner_loss_alpha: float = 1.0\n    pitch_loss_alpha: float = 0.1\n    dur_loss_alpha: float = 0.1\n    binary_align_loss_alpha: float = 0.1\n    binary_loss_warmup_epochs: int = 150\n\n    # overrides\n    min_seq_len: int = 13\n    max_seq_len: int = 200\n    r: int = 1  # DO NOT CHANGE\n\n    # dataset configs\n    compute_f0: bool = True\n    f0_cache_path: str = None\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"It 
took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\",\n            \"Be a voice, not an echo.\",\n            \"I'm sorry Dave. I'm afraid I can't do that.\",\n            \"This cake is great. It's so delicious and moist.\",\n            \"Prior to November 22, 1963.\",\n        ]\n    )\n\n    def __post_init__(self):\n        # Pass multi-speaker parameters to the model args as `model.init_multispeaker()` looks for it there.\n        if self.num_speakers > 0:\n            self.model_args.num_speakers = self.num_speakers\n\n        # speaker embedding settings\n        if self.use_speaker_embedding:\n            self.model_args.use_speaker_embedding = True\n        if self.speakers_file:\n            self.model_args.speakers_file = self.speakers_file\n\n        # d-vector settings\n        if self.use_d_vector_file:\n            self.model_args.use_d_vector_file = True\n        if self.d_vector_dim is not None and self.d_vector_dim > 0:\n            self.model_args.d_vector_dim = self.d_vector_dim\n        if self.d_vector_file:\n            self.model_args.d_vector_file = self.d_vector_file\n"
  },
  {
    "path": "TTS/tts/configs/fast_speech_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\nfrom TTS.tts.models.forward_tts import ForwardTTSArgs\n\n\n@dataclass\nclass FastSpeechConfig(BaseTTSConfig):\n    \"\"\"Configure `ForwardTTS` as FastSpeech model.\n\n    Example:\n\n        >>> from TTS.tts.configs.fast_speech_config import FastSpeechConfig\n        >>> config = FastSpeechConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `fast_pitch`.\n\n        base_model (str):\n            Name of the base model being configured as this model so that 🐸 TTS knows it needs to initiate\n            the base model rather than searching for the `model` implementation. Defaults to `forward_tts`.\n\n        model_args (Coqpit):\n            Model class arguments. Check `FastSpeechArgs` for more details. Defaults to `FastSpeechArgs()`.\n\n        data_dep_init_steps (int):\n            Number of steps used for computing normalization parameters at the beginning of the training. GlowTTS uses\n            Activation Normalization that pre-computes normalization stats at the beginning and use the same values\n            for the rest. Defaults to 10.\n\n        speakers_file (str):\n            Path to the file containing the list of speakers. Needed at inference for loading matching speaker ids to\n            speaker names. Defaults to `None`.\n\n\n        use_speaker_embedding (bool):\n            enable / disable using speaker embeddings for multi-speaker models. If set True, the model is\n            in the multi-speaker mode. Defaults to False.\n\n        use_d_vector_file (bool):\n            enable /disable using external speaker embeddings in place of the learned embeddings. Defaults to False.\n\n        d_vector_file (str):\n            Path to the file including pre-computed speaker embeddings. 
Defaults to None.\n\n        d_vector_dim (int):\n            Dimension of the external speaker embeddings. Defaults to 0.\n\n        optimizer (str):\n            Name of the model optimizer. Defaults to `Adam`.\n\n        optimizer_params (dict):\n            Arguments of the model optimizer. Defaults to `{\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6}`.\n\n        lr_scheduler (str):\n            Name of the learning rate scheduler. Defaults to `Noam`.\n\n        lr_scheduler_params (dict):\n            Arguments of the learning rate scheduler. Defaults to `{\"warmup_steps\": 4000}`.\n\n        lr (float):\n            Initial learning rate. Defaults to `1e-3`.\n\n        grad_clip (float):\n            Gradient norm clipping value. Defaults to `5.0`.\n\n        spec_loss_type (str):\n            Type of the spectrogram loss. Check `ForwardTTSLoss` for possible values. Defaults to `mse`.\n\n        duration_loss_type (str):\n            Type of the duration loss. Check `ForwardTTSLoss` for possible values. Defaults to `mse`.\n\n        use_ssim_loss (bool):\n            Enable/disable the use of SSIM (Structural Similarity) loss. Defaults to True.\n\n        wd (float):\n            Weight decay coefficient. Defaults to `1e-7`.\n\n        ssim_loss_alpha (float):\n            Weight for the SSIM loss. If set 0, disables the SSIM loss. Defaults to 1.0.\n\n        dur_loss_alpha (float):\n            Weight for the duration predictor's loss. If set 0, disables the huber loss. Defaults to 1.0.\n\n        spec_loss_alpha (float):\n            Weight for the L1 spectrogram loss. If set 0, disables the L1 loss. Defaults to 1.0.\n\n        pitch_loss_alpha (float):\n            Weight for the pitch predictor's loss. If set 0, disables the pitch predictor. Defaults to 1.0.\n\n        binary_loss_alpha (float):\n            Weight for the binary loss. If set 0, disables the binary loss. 
Defaults to 1.0.\n\n        binary_loss_warmup_epochs (int):\n            Number of epochs to gradually increase the binary loss impact. Defaults to 150.\n\n        min_seq_len (int):\n            Minimum input sequence length to be used at training.\n\n        max_seq_len (int):\n            Maximum input sequence length to be used at training. Larger values result in more VRAM usage.\n    \"\"\"\n\n    model: str = \"fast_speech\"\n    base_model: str = \"forward_tts\"\n\n    # model specific params\n    # use a factory so each config gets its own args instance (`__post_init__` mutates it)\n    model_args: ForwardTTSArgs = field(default_factory=lambda: ForwardTTSArgs(use_pitch=False))\n\n    # multi-speaker settings\n    num_speakers: int = 0\n    speakers_file: str = None\n    use_speaker_embedding: bool = False\n    use_d_vector_file: bool = False\n    d_vector_file: str = None\n    d_vector_dim: int = 0\n\n    # optimizer parameters\n    optimizer: str = \"Adam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6})\n    lr_scheduler: str = \"NoamLR\"\n    lr_scheduler_params: dict = field(default_factory=lambda: {\"warmup_steps\": 4000})\n    lr: float = 1e-4\n    grad_clip: float = 5.0\n\n    # loss params\n    spec_loss_type: str = \"mse\"\n    duration_loss_type: str = \"mse\"\n    use_ssim_loss: bool = True\n    ssim_loss_alpha: float = 1.0\n    dur_loss_alpha: float = 1.0\n    spec_loss_alpha: float = 1.0\n    pitch_loss_alpha: float = 0.0\n    aligner_loss_alpha: float = 1.0\n    binary_align_loss_alpha: float = 1.0\n    binary_loss_warmup_epochs: int = 150\n\n    # overrides\n    min_seq_len: int = 13\n    max_seq_len: int = 200\n    r: int = 1  # DO NOT CHANGE\n\n    # dataset configs\n    compute_f0: bool = False\n    f0_cache_path: str = None\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\",\n            \"Be a voice, not an echo.\",\n            \"I'm 
sorry Dave. I'm afraid I can't do that.\",\n            \"This cake is great. It's so delicious and moist.\",\n            \"Prior to November 22, 1963.\",\n        ]\n    )\n\n    def __post_init__(self):\n        # Pass multi-speaker parameters to the model args as `model.init_multispeaker()` looks for it there.\n        if self.num_speakers > 0:\n            self.model_args.num_speakers = self.num_speakers\n\n        # speaker embedding settings\n        if self.use_speaker_embedding:\n            self.model_args.use_speaker_embedding = True\n        if self.speakers_file:\n            self.model_args.speakers_file = self.speakers_file\n\n        # d-vector settings\n        if self.use_d_vector_file:\n            self.model_args.use_d_vector_file = True\n        if self.d_vector_dim is not None and self.d_vector_dim > 0:\n            self.model_args.d_vector_dim = self.d_vector_dim\n        if self.d_vector_file:\n            self.model_args.d_vector_file = self.d_vector_file\n"
  },
  {
    "path": "TTS/tts/configs/fastspeech2_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\nfrom TTS.tts.models.forward_tts import ForwardTTSArgs\n\n\n@dataclass\nclass Fastspeech2Config(BaseTTSConfig):\n    \"\"\"Configure `ForwardTTS` as FastPitch model.\n\n    Example:\n\n        >>> from TTS.tts.configs.fastspeech2_config import FastSpeech2Config\n        >>> config = FastSpeech2Config()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `fast_pitch`.\n\n        base_model (str):\n            Name of the base model being configured as this model so that 🐸 TTS knows it needs to initiate\n            the base model rather than searching for the `model` implementation. Defaults to `forward_tts`.\n\n        model_args (Coqpit):\n            Model class arguments. Check `FastPitchArgs` for more details. Defaults to `FastPitchArgs()`.\n\n        data_dep_init_steps (int):\n            Number of steps used for computing normalization parameters at the beginning of the training. GlowTTS uses\n            Activation Normalization that pre-computes normalization stats at the beginning and use the same values\n            for the rest. Defaults to 10.\n\n        speakers_file (str):\n            Path to the file containing the list of speakers. Needed at inference for loading matching speaker ids to\n            speaker names. Defaults to `None`.\n\n        use_speaker_embedding (bool):\n            enable / disable using speaker embeddings for multi-speaker models. If set True, the model is\n            in the multi-speaker mode. Defaults to False.\n\n        use_d_vector_file (bool):\n            enable /disable using external speaker embeddings in place of the learned embeddings. Defaults to False.\n\n        d_vector_file (str):\n            Path to the file including pre-computed speaker embeddings. 
Defaults to None.\n\n        d_vector_dim (int):\n            Dimension of the external speaker embeddings. Defaults to 0.\n\n        optimizer (str):\n            Name of the model optimizer. Defaults to `Adam`.\n\n        optimizer_params (dict):\n            Arguments of the model optimizer. Defaults to `{\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6}`.\n\n        lr_scheduler (str):\n            Name of the learning rate scheduler. Defaults to `Noam`.\n\n        lr_scheduler_params (dict):\n            Arguments of the learning rate scheduler. Defaults to `{\"warmup_steps\": 4000}`.\n\n        lr (float):\n            Initial learning rate. Defaults to `1e-3`.\n\n        grad_clip (float):\n            Gradient norm clipping value. Defaults to `5.0`.\n\n        spec_loss_type (str):\n            Type of the spectrogram loss. Check `ForwardTTSLoss` for possible values. Defaults to `mse`.\n\n        duration_loss_type (str):\n            Type of the duration loss. Check `ForwardTTSLoss` for possible values. Defaults to `mse`.\n\n        use_ssim_loss (bool):\n            Enable/disable the use of SSIM (Structural Similarity) loss. Defaults to True.\n\n        wd (float):\n            Weight decay coefficient. Defaults to `1e-7`.\n\n        ssim_loss_alpha (float):\n            Weight for the SSIM loss. If set 0, disables the SSIM loss. Defaults to 1.0.\n\n        dur_loss_alpha (float):\n            Weight for the duration predictor's loss. If set 0, disables the huber loss. Defaults to 1.0.\n\n        spec_loss_alpha (float):\n            Weight for the L1 spectrogram loss. If set 0, disables the L1 loss. Defaults to 1.0.\n\n        pitch_loss_alpha (float):\n            Weight for the pitch predictor's loss. If set 0, disables the pitch predictor. Defaults to 1.0.\n\n        energy_loss_alpha (float):\n            Weight for the energy predictor's loss. If set 0, disables the energy predictor. 
Defaults to 0.1.\n\n        binary_align_loss_alpha (float):\n            Weight for the binary loss. If set 0, disables the binary loss. Defaults to 0.1.\n\n        binary_loss_warmup_epochs (int):\n            Number of epochs to gradually increase the binary loss impact. Defaults to 150.\n\n        min_seq_len (int):\n            Minimum input sequence length to be used at training.\n\n        max_seq_len (int):\n            Maximum input sequence length to be used at training. Larger values result in more VRAM usage.\n\n        # dataset configs\n        compute_f0 (bool):\n            Compute pitch. Defaults to True.\n\n        f0_cache_path (str):\n            Pitch cache path. Defaults to None.\n\n        # dataset configs\n        compute_energy (bool):\n            Compute energy. Defaults to True.\n\n        energy_cache_path (str):\n            Energy cache path. Defaults to None.\n    \"\"\"\n\n    model: str = \"fastspeech2\"\n    base_model: str = \"forward_tts\"\n\n    # model specific params\n    # use a factory so each config gets its own args instance (`__post_init__` mutates it)\n    model_args: ForwardTTSArgs = field(default_factory=lambda: ForwardTTSArgs(use_pitch=True, use_energy=True))\n\n    # multi-speaker settings\n    num_speakers: int = 0\n    speakers_file: str = None\n    use_speaker_embedding: bool = False\n    use_d_vector_file: bool = False\n    d_vector_file: str = None\n    d_vector_dim: int = 0\n\n    # optimizer parameters\n    optimizer: str = \"Adam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6})\n    lr_scheduler: str = \"NoamLR\"\n    lr_scheduler_params: dict = field(default_factory=lambda: {\"warmup_steps\": 4000})\n    lr: float = 1e-4\n    grad_clip: float = 5.0\n\n    # loss params\n    spec_loss_type: str = \"mse\"\n    duration_loss_type: str = \"mse\"\n    use_ssim_loss: bool = True\n    ssim_loss_alpha: float = 1.0\n    spec_loss_alpha: float = 1.0\n    aligner_loss_alpha: float = 1.0\n    pitch_loss_alpha: float = 0.1\n    energy_loss_alpha: float = 0.1\n    dur_loss_alpha: 
float = 0.1\n    binary_align_loss_alpha: float = 0.1\n    binary_loss_warmup_epochs: int = 150\n\n    # overrides\n    min_seq_len: int = 13\n    max_seq_len: int = 200\n    r: int = 1  # DO NOT CHANGE\n\n    # dataset configs\n    compute_f0: bool = True\n    f0_cache_path: str = None\n\n    # dataset configs\n    compute_energy: bool = True\n    energy_cache_path: str = None\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\",\n            \"Be a voice, not an echo.\",\n            \"I'm sorry Dave. I'm afraid I can't do that.\",\n            \"This cake is great. It's so delicious and moist.\",\n            \"Prior to November 22, 1963.\",\n        ]\n    )\n\n    def __post_init__(self):\n        # Pass multi-speaker parameters to the model args as `model.init_multispeaker()` looks for it there.\n        if self.num_speakers > 0:\n            self.model_args.num_speakers = self.num_speakers\n\n        # speaker embedding settings\n        if self.use_speaker_embedding:\n            self.model_args.use_speaker_embedding = True\n        if self.speakers_file:\n            self.model_args.speakers_file = self.speakers_file\n\n        # d-vector settings\n        if self.use_d_vector_file:\n            self.model_args.use_d_vector_file = True\n        if self.d_vector_dim is not None and self.d_vector_dim > 0:\n            self.model_args.d_vector_dim = self.d_vector_dim\n        if self.d_vector_file:\n            self.model_args.d_vector_file = self.d_vector_file\n"
  },
  {
    "path": "TTS/tts/configs/glow_tts_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\n\n\n@dataclass\nclass GlowTTSConfig(BaseTTSConfig):\n    \"\"\"Defines parameters for GlowTTS model.\n\n    Example:\n\n        >>> from TTS.tts.configs.glow_tts_config import GlowTTSConfig\n        >>> config = GlowTTSConfig()\n\n    Args:\n        model(str):\n            Model name used for selecting the right model at initialization. Defaults to `glow_tts`.\n        encoder_type (str):\n            Type of the encoder used by the model. Look at `TTS.tts.layers.glow_tts.encoder` for more details.\n            Defaults to `rel_pos_transformers`.\n        encoder_params (dict):\n            Parameters used to define the encoder network. Look at `TTS.tts.layers.glow_tts.encoder` for more details.\n            Defaults to `{\"kernel_size\": 3, \"dropout_p\": 0.1, \"num_layers\": 6, \"num_heads\": 2, \"hidden_channels_ffn\": 768}`\n        use_encoder_prenet (bool):\n            enable / disable the use of a prenet for the encoder. Defaults to True.\n        hidden_channels_enc (int):\n            Number of base hidden channels used by the encoder network. It defines the input and the output channel sizes,\n            and for some encoder types internal hidden channels sizes too. Defaults to 192.\n        hidden_channels_dec (int):\n            Number of base hidden channels used by the decoder WaveNet network. Defaults to 192 as in the original work.\n        hidden_channels_dp (int):\n            Number of layer channels of the duration predictor network. Defaults to 256 as in the original work.\n        mean_only (bool):\n            If true predict only the mean values by the decoder flow. Defaults to True.\n        out_channels (int):\n            Number of channels of the model output tensor. Defaults to 80.\n        num_flow_blocks_dec (int):\n            Number of decoder blocks. 
Defaults to 12.\n        inference_noise_scale (float):\n            Noise scale used at inference. Defaults to 0.33.\n        kernel_size_dec (int):\n            Decoder kernel size. Defaults to 5\n        dilation_rate (int):\n            Rate to increase dilation by each layer in a decoder block. Defaults to 1.\n        num_block_layers (int):\n            Number of decoder layers in each decoder block.  Defaults to 4.\n        dropout_p_dec (float):\n            Dropout rate for decoder. Defaults to 0.1.\n        num_speaker (int):\n            Number of speaker to define the size of speaker embedding layer. Defaults to 0.\n        c_in_channels (int):\n            Number of speaker embedding channels. It is set to 512 if embeddings are learned. Defaults to 0.\n        num_splits (int):\n            Number of split levels in inversible conv1x1 operation. Defaults to 4.\n        num_squeeze (int):\n            Number of squeeze levels. When squeezing channels increases and time steps reduces by the factor\n            'num_squeeze'. Defaults to 2.\n        sigmoid_scale (bool):\n            enable/disable sigmoid scaling in decoder. Defaults to False.\n        mean_only (bool):\n            If True, encoder only computes mean value and uses constant variance for each time step. Defaults to true.\n        encoder_type (str):\n            Encoder module type. Possible values are`[\"rel_pos_transformer\", \"gated_conv\", \"residual_conv_bn\", \"time_depth_separable\"]`\n            Check `TTS.tts.layers.glow_tts.encoder` for more details. Defaults to `rel_pos_transformers` as in the original paper.\n        encoder_params (dict):\n            Encoder module parameters. Defaults to None.\n        d_vector_dim (int):\n            Channels of external speaker embedding vectors. Defaults to 0.\n        data_dep_init_steps (int):\n            Number of steps used for computing normalization parameters at the beginning of the training. 
GlowTTS uses\n            Activation Normalization that pre-computes normalization stats at the beginning and use the same values\n            for the rest. Defaults to 10.\n        style_wav_for_test (str):\n            Path to the wav file used for changing the style of the speech. Defaults to None.\n        inference_noise_scale (float):\n            Variance used for sampling the random noise added to the decoder's input at inference. Defaults to 0.0.\n        length_scale (float):\n            Multiply the predicted durations with this value to change the speech speed. Defaults to 1.\n        use_speaker_embedding (bool):\n            enable / disable using speaker embeddings for multi-speaker models. If set True, the model is\n            in the multi-speaker mode. Defaults to False.\n        use_d_vector_file (bool):\n            enable /disable using external speaker embeddings in place of the learned embeddings. Defaults to False.\n        d_vector_file (str):\n            Path to the file including pre-computed speaker embeddings. Defaults to None.\n        noam_schedule (bool):\n            enable / disable the use of Noam LR scheduler. Defaults to False.\n        warmup_steps (int):\n            Number of warm-up steps for the Noam scheduler. Defaults 4000.\n        lr (float):\n            Initial learning rate. Defaults to `1e-3`.\n        wd (float):\n            Weight decay coefficient. Defaults to `1e-7`.\n        min_seq_len (int):\n            Minimum input sequence length to be used at training.\n        max_seq_len (int):\n            Maximum input sequence length to be used at training. 
Larger values result in more VRAM usage.\n    \"\"\"\n\n    model: str = \"glow_tts\"\n\n    # model params\n    num_chars: int = None\n    encoder_type: str = \"rel_pos_transformer\"\n    encoder_params: dict = field(\n        default_factory=lambda: {\n            \"kernel_size\": 3,\n            \"dropout_p\": 0.1,\n            \"num_layers\": 6,\n            \"num_heads\": 2,\n            \"hidden_channels_ffn\": 768,\n        }\n    )\n    use_encoder_prenet: bool = True\n    hidden_channels_enc: int = 192\n    hidden_channels_dec: int = 192\n    hidden_channels_dp: int = 256\n    dropout_p_dp: float = 0.1\n    dropout_p_dec: float = 0.05\n    mean_only: bool = True\n    out_channels: int = 80\n    num_flow_blocks_dec: int = 12\n    inference_noise_scale: float = 0.33\n    kernel_size_dec: int = 5\n    dilation_rate: int = 1\n    num_block_layers: int = 4\n    num_speakers: int = 0\n    c_in_channels: int = 0\n    num_splits: int = 4\n    num_squeeze: int = 2\n    sigmoid_scale: bool = False\n    encoder_type: str = \"rel_pos_transformer\"\n    encoder_params: dict = field(\n        default_factory=lambda: {\n            \"kernel_size\": 3,\n            \"dropout_p\": 0.1,\n            \"num_layers\": 6,\n            \"num_heads\": 2,\n            \"hidden_channels_ffn\": 768,\n            \"input_length\": None,\n        }\n    )\n    d_vector_dim: int = 0\n\n    # training params\n    data_dep_init_steps: int = 10\n\n    # inference params\n    style_wav_for_test: str = None\n    inference_noise_scale: float = 0.0\n    length_scale: float = 1.0\n\n    # multi-speaker settings\n    use_speaker_embedding: bool = False\n    speakers_file: str = None\n    use_d_vector_file: bool = False\n    d_vector_file: str = False\n\n    # optimizer parameters\n    optimizer: str = \"RAdam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6})\n    lr_scheduler: str = \"NoamLR\"\n    lr_scheduler_params: dict = 
field(default_factory=lambda: {\"warmup_steps\": 4000})\n    grad_clip: float = 5.0\n    lr: float = 1e-3\n\n    # overrides\n    min_seq_len: int = 3\n    max_seq_len: int = 500\n    r: int = 1  # DO NOT CHANGE - TODO: make this immutable once coqpit implements it.\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\",\n            \"Be a voice, not an echo.\",\n            \"I'm sorry Dave. I'm afraid I can't do that.\",\n            \"This cake is great. It's so delicious and moist.\",\n            \"Prior to November 22, 1963.\",\n        ]\n    )\n"
  },
  {
    "path": "TTS/tts/configs/neuralhmm_tts_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\n\n\n@dataclass\nclass NeuralhmmTTSConfig(BaseTTSConfig):\n    \"\"\"\n    Define parameters for Neural HMM TTS model.\n\n    Example:\n\n        >>> from TTS.tts.configs.overflow_config import OverflowConfig\n        >>> config = OverflowConfig()\n\n    Args:\n        model (str):\n            Model name used to select the right model class to initilize. Defaults to `Overflow`.\n        run_eval_steps (int):\n            Run evalulation epoch after N steps. If None, waits until training epoch is completed. Defaults to None.\n        save_step (int):\n            Save local checkpoint every save_step steps. Defaults to 500.\n        plot_step (int):\n            Plot training stats on the logger every plot_step steps. Defaults to 1.\n        model_param_stats (bool):\n            Log model parameters stats on the logger dashboard. Defaults to False.\n        force_generate_statistics (bool):\n            Force generate mel normalization statistics. Defaults to False.\n        mel_statistics_parameter_path (str):\n            Path to the mel normalization statistics.If the model doesn't finds a file there it will generate statistics.\n            Defaults to None.\n        num_chars (int):\n            Number of characters used by the model. It must be defined before initializing the model. Defaults to None.\n        state_per_phone (int):\n            Generates N states per phone. Similar, to `add_blank` parameter in GlowTTS but in Overflow it is upsampled by model's encoder. Defaults to 2.\n        encoder_in_out_features (int):\n            Channels of encoder input and character embedding tensors. Defaults to 512.\n        encoder_n_convolutions (int):\n            Number of convolution layers in the encoder. Defaults to 3.\n        out_channels (int):\n            Channels of the final model output. 
It must match the spectrogram size. Defaults to 80.\n        ar_order (int):\n            Autoregressive order of the model. Defaults to 1. In ablations of Neural HMM it was found that more autoregression, while giving more variation, hurts the naturalness of the synthesised audio.\n        sampling_temp (float):\n            Variation added to the sample from the latent space of neural HMM. Defaults to 0.\n        deterministic_transition (bool):\n            Deterministic duration generation based on duration quantiles as defined in \"S. Ronanki, O. Watts, S. King, and G. E. Henter, “Median-based generation of synthetic speech durations using a nonparametric approach,” in Proc. SLT, 2016.\". Defaults to True.\n        duration_threshold (float):\n            Threshold for duration quantiles. Defaults to 0.43. Tune this to change the speaking rate of the synthesis, where lower values define a slower speaking rate and higher values define a faster speaking rate.\n        use_grad_checkpointing (bool):\n            Use gradient checkpointing to save memory. In a multi-GPU setting, PyTorch currently does not support gradient checkpointing inside a loop, so it has to be turned off there. Adjust depending on which setup allows a larger batch size: a single GPU with checkpointing or multiple GPUs without it. Defaults to True.\n        max_sampling_time (int):\n            Maximum sampling time while synthesising latents from neural HMM. Defaults to 1000.\n        prenet_type (str):\n            `original` or `bn`. `original` sets the default Prenet and `bn` uses the Batch Normalization version of the\n            Prenet. Defaults to `original`.\n        prenet_dim (int):\n            Dimension of the Prenet. Defaults to 256.\n        prenet_n_layers (int):\n            Number of layers in the Prenet. Defaults to 2.\n        prenet_dropout (float):\n            Dropout rate of the Prenet. Defaults to 0.5.\n        prenet_dropout_at_inference (bool):\n            Use dropout at inference time. 
Defaults to True.\n        memory_rnn_dim (int):\n            Dimension of the memory LSTM to process the prenet output. Defaults to 1024.\n        outputnet_size (list[int]):\n            Size of the output network inside the neural HMM. Defaults to [1024].\n        flat_start_params (dict):\n            Parameters for the flat start initialization of the neural HMM. Defaults to `{\"mean\": 0.0, \"std\": 1.0, \"transition_p\": 0.14}`.\n            It will be recomputed when you pass the dataset.\n        std_floor (float):\n            Floor value for the standard deviation of the neural HMM. Prevents the model from cheating by placing a point mass on a datapoint and obtaining infinite likelihood. Defaults to 0.001.\n            It is called `variance flooring` in standard HMM literature.\n        optimizer (str):\n            Optimizer to use for training. Defaults to `Adam`.\n        optimizer_params (dict):\n            Parameters for the optimizer. Defaults to `{\"weight_decay\": 1e-6}`.\n        grad_clip (float):\n            Gradient clipping threshold. Defaults to 40_000.\n        lr (float):\n            Learning rate. Defaults to 1e-3.\n        lr_scheduler (str):\n            Learning rate scheduler for the training. Use one from `torch.optim.lr_scheduler` schedulers or\n            `TTS.utils.training`. Defaults to `None`.\n        min_text_len (int):\n            Minimum input text length to be used at training. Defaults to 10.\n        max_text_len (int):\n            Maximum input text length to be used at training. Defaults to 500. 
Larger values result in more VRAM usage.\n    \"\"\"\n\n    model: str = \"NeuralHMM_TTS\"\n\n    # Training and Checkpoint configs\n    run_eval_steps: int = 100\n    save_step: int = 500\n    plot_step: int = 1\n    model_param_stats: bool = False\n\n    # data parameters\n    force_generate_statistics: bool = False\n    mel_statistics_parameter_path: str = None\n\n    # Encoder parameters\n    num_chars: int = None\n    state_per_phone: int = 2\n    encoder_in_out_features: int = 512\n    encoder_n_convolutions: int = 3\n\n    # HMM parameters\n    out_channels: int = 80\n    ar_order: int = 1\n    sampling_temp: float = 0\n    deterministic_transition: bool = True\n    duration_threshold: float = 0.43\n    use_grad_checkpointing: bool = True\n    max_sampling_time: int = 1000\n\n    ## Prenet parameters\n    prenet_type: str = \"original\"\n    prenet_dim: int = 256\n    prenet_n_layers: int = 2\n    prenet_dropout: float = 0.5\n    prenet_dropout_at_inference: bool = True\n    memory_rnn_dim: int = 1024\n\n    ## Outputnet parameters\n    outputnet_size: List[int] = field(default_factory=lambda: [1024])\n    flat_start_params: dict = field(default_factory=lambda: {\"mean\": 0.0, \"std\": 1.0, \"transition_p\": 0.14})\n    std_floor: float = 0.001\n\n    # optimizer parameters\n    optimizer: str = \"Adam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"weight_decay\": 1e-6})\n    grad_clip: float = 40000.0\n    lr: float = 1e-3\n    lr_scheduler: str = None\n\n    # overrides\n    min_text_len: int = 10\n    max_text_len: int = 500\n    min_audio_len: int = 512\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"Be a voice, not an echo.\",\n        ]\n    )\n\n    # Extra needed config\n    r: int = 1\n    use_d_vector_file: bool = False\n    use_speaker_embedding: bool = False\n\n    def check_values(self):\n        \"\"\"Validate the hyperparameters.\n\n        Raises:\n            
AssertionError: when the parameter network (`outputnet_size`) has no layers\n            AssertionError: transition probability is not between 0 and 1\n        \"\"\"\n        assert self.ar_order > 0, \"AR order must be greater than 0; it is an autoregressive model.\"\n        assert (\n            len(self.outputnet_size) >= 1\n        ), f\"Parameter Network must have at least one layer. Check the config file for the parameter network. Provided: {self.outputnet_size}\"\n        assert (\n            0 < self.flat_start_params[\"transition_p\"] < 1\n        ), f\"Transition probability must be between 0 and 1. Provided: {self.flat_start_params['transition_p']}\"\n"
  },
  {
    "path": "TTS/tts/configs/overflow_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\n\n\n@dataclass\nclass OverflowConfig(BaseTTSConfig):  # The classname has to be camel case\n    \"\"\"\n    Define parameters for OverFlow model.\n\n    Example:\n\n        >>> from TTS.tts.configs.overflow_config import OverflowConfig\n        >>> config = OverflowConfig()\n\n    Args:\n        model (str):\n            Model name used to select the right model class to initilize. Defaults to `Overflow`.\n        run_eval_steps (int):\n            Run evalulation epoch after N steps. If None, waits until training epoch is completed. Defaults to None.\n        save_step (int):\n            Save local checkpoint every save_step steps. Defaults to 500.\n        plot_step (int):\n            Plot training stats on the logger every plot_step steps. Defaults to 1.\n        model_param_stats (bool):\n            Log model parameters stats on the logger dashboard. Defaults to False.\n        force_generate_statistics (bool):\n            Force generate mel normalization statistics. Defaults to False.\n        mel_statistics_parameter_path (str):\n            Path to the mel normalization statistics.If the model doesn't finds a file there it will generate statistics.\n            Defaults to None.\n        num_chars (int):\n            Number of characters used by the model. It must be defined before initializing the model. Defaults to None.\n        state_per_phone (int):\n            Generates N states per phone. Similar, to `add_blank` parameter in GlowTTS but in Overflow it is upsampled by model's encoder. Defaults to 2.\n        encoder_in_out_features (int):\n            Channels of encoder input and character embedding tensors. Defaults to 512.\n        encoder_n_convolutions (int):\n            Number of convolution layers in the encoder. 
Defaults to 3.\n        out_channels (int):\n            Channels of the final model output. It must match the spectrogram size. Defaults to 80.\n        ar_order (int):\n            Autoregressive order of the model. Defaults to 1. In ablations of Neural HMM it was found that more autoregression, while giving more variation, hurts the naturalness of the synthesised audio.\n        sampling_temp (float):\n            Variation added to the sample from the latent space of neural HMM. Defaults to 0.334.\n        deterministic_transition (bool):\n            Deterministic duration generation based on duration quantiles as defined in \"S. Ronanki, O. Watts, S. King, and G. E. Henter, “Median-based generation of synthetic speech durations using a nonparametric approach,” in Proc. SLT, 2016.\". Defaults to True.\n        duration_threshold (float):\n            Threshold for duration quantiles. Defaults to 0.55. Tune this to change the speaking rate of the synthesis, where lower values define a slower speaking rate and higher values define a faster speaking rate.\n        use_grad_checkpointing (bool):\n            Use gradient checkpointing to save memory. In a multi-GPU setting, PyTorch currently does not support gradient checkpointing inside a loop, so it has to be turned off there. Adjust depending on which setup allows a larger batch size: a single GPU with checkpointing or multiple GPUs without it. Defaults to True.\n        max_sampling_time (int):\n            Maximum sampling time while synthesising latents from neural HMM. Defaults to 1000.\n        prenet_type (str):\n            `original` or `bn`. `original` sets the default Prenet and `bn` uses the Batch Normalization version of the\n            Prenet. Defaults to `original`.\n        prenet_dim (int):\n            Dimension of the Prenet. Defaults to 256.\n        prenet_n_layers (int):\n            Number of layers in the Prenet. Defaults to 2.\n        prenet_dropout (float):\n            Dropout rate of the Prenet. 
Defaults to 0.5.\n        prenet_dropout_at_inference (bool):\n            Use dropout at inference time. Defaults to False.\n        memory_rnn_dim (int):\n            Dimension of the memory LSTM to process the prenet output. Defaults to 1024.\n        outputnet_size (list[int]):\n            Size of the output network inside the neural HMM. Defaults to [1024].\n        flat_start_params (dict):\n            Parameters for the flat start initialization of the neural HMM. Defaults to `{\"mean\": 0.0, \"std\": 1.0, \"transition_p\": 0.14}`.\n            It will be recomputed when you pass the dataset.\n        std_floor (float):\n            Floor value for the standard deviation of the neural HMM. Prevents the model from cheating by placing a point mass on a datapoint and obtaining infinite likelihood. Defaults to 0.01.\n            It is called `variance flooring` in standard HMM literature.\n        hidden_channels_dec (int):\n            Number of base hidden channels used by the decoder WaveNet network. Defaults to 150.\n        kernel_size_dec (int):\n            Decoder kernel size. Defaults to 5.\n        dilation_rate (int):\n            Rate to increase dilation by each layer in a decoder block. Defaults to 1.\n        num_flow_blocks_dec (int):\n            Number of decoder blocks. Defaults to 12.\n        num_block_layers (int):\n            Number of decoder layers in each decoder block. Defaults to 4.\n        dropout_p_dec (float):\n            Dropout rate of the decoder. Defaults to 0.05.\n        num_splits (int):\n            Number of split levels in the invertible conv1x1 operation. Defaults to 4.\n        num_squeeze (int):\n            Number of squeeze levels. Squeezing increases the number of channels and reduces the number of time steps by the factor\n            'num_squeeze'. Defaults to 2.\n        sigmoid_scale (bool):\n            enable / disable sigmoid scaling in decoder. Defaults to False.\n        c_in_channels (int):\n            Unused parameter from GlowTTS's decoder. Defaults to 0.\n        optimizer (str):\n            Optimizer to use for training. 
Defaults to `adam`.\n        optimizer_params (dict):\n            Parameters for the optimizer. Defaults to `{\"weight_decay\": 1e-6}`.\n        grad_clip (float):\n            Gradient clipping threshold. Defaults to 40_000.\n        lr (float):\n            Learning rate. Defaults to 1e-3.\n        lr_scheduler (str):\n            Learning rate scheduler for the training. Use one from `torch.optim.Scheduler` schedulers or\n            `TTS.utils.training`. Defaults to `None`.\n        min_seq_len (int):\n            Minimum input sequence length to be used at training.\n        max_seq_len (int):\n            Maximum input sequence length to be used at training. Larger values result in more VRAM usage.\n    \"\"\"\n\n    model: str = \"Overflow\"\n\n    # Training and Checkpoint configs\n    run_eval_steps: int = 100\n    save_step: int = 500\n    plot_step: int = 1\n    model_param_stats: bool = False\n\n    # data parameters\n    force_generate_statistics: bool = False\n    mel_statistics_parameter_path: str = None\n\n    # Encoder parameters\n    num_chars: int = None\n    state_per_phone: int = 2\n    encoder_in_out_features: int = 512\n    encoder_n_convolutions: int = 3\n\n    # HMM parameters\n    out_channels: int = 80\n    ar_order: int = 1\n    sampling_temp: float = 0.334\n    deterministic_transition: bool = True\n    duration_threshold: float = 0.55\n    use_grad_checkpointing: bool = True\n    max_sampling_time: int = 1000\n\n    ## Prenet parameters\n    prenet_type: str = \"original\"\n    prenet_dim: int = 256\n    prenet_n_layers: int = 2\n    prenet_dropout: float = 0.5\n    prenet_dropout_at_inference: bool = False\n    memory_rnn_dim: int = 1024\n\n    ## Outputnet parameters\n    outputnet_size: List[int] = field(default_factory=lambda: [1024])\n    flat_start_params: dict = field(default_factory=lambda: {\"mean\": 0.0, \"std\": 1.0, \"transition_p\": 0.14})\n    std_floor: float = 0.01\n\n    # Decoder parameters\n    hidden_channels_dec: 
int = 150\n    kernel_size_dec: int = 5\n    dilation_rate: int = 1\n    num_flow_blocks_dec: int = 12\n    num_block_layers: int = 4\n    dropout_p_dec: float = 0.05\n    num_splits: int = 4\n    num_squeeze: int = 2\n    sigmoid_scale: bool = False\n    c_in_channels: int = 0\n\n    # optimizer parameters\n    optimizer: str = \"Adam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"weight_decay\": 1e-6})\n    grad_clip: float = 40000.0\n    lr: float = 1e-3\n    lr_scheduler: str = None\n\n    # overrides\n    min_text_len: int = 10\n    max_text_len: int = 500\n    min_audio_len: int = 512\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"Be a voice, not an echo.\",\n        ]\n    )\n\n    # Extra needed config\n    r: int = 1\n    use_d_vector_file: bool = False\n    use_speaker_embedding: bool = False\n\n    def check_values(self):\n        \"\"\"Validate the hyperparameters.\n\n        Raises:\n            AssertionError: when the parameter network (`outputnet_size`) has no layers\n            AssertionError: transition probability is not between 0 and 1\n        \"\"\"\n        assert self.ar_order > 0, \"AR order must be greater than 0; it is an autoregressive model.\"\n        assert (\n            len(self.outputnet_size) >= 1\n        ), f\"Parameter Network must have at least one layer. Check the config file for the parameter network. Provided: {self.outputnet_size}\"\n        assert (\n            0 < self.flat_start_params[\"transition_p\"] < 1\n        ), f\"Transition probability must be between 0 and 1. Provided: {self.flat_start_params['transition_p']}\"\n"
  },
  {
    "path": "TTS/tts/configs/shared_configs.py",
    "content": "from dataclasses import asdict, dataclass, field\nfrom typing import Dict, List\n\nfrom coqpit import Coqpit, check_argument\n\nfrom TTS.config import BaseAudioConfig, BaseDatasetConfig, BaseTrainingConfig\n\n\n@dataclass\nclass GSTConfig(Coqpit):\n    \"\"\"Defines the Global Style Token Module\n\n    Args:\n        gst_style_input_wav (str):\n            Path to the wav file used to define the style of the output speech at inference. Defaults to None.\n\n        gst_style_input_weights (dict):\n            Defines the weights for each style token used at inference. Defaults to None.\n\n        gst_embedding_dim (int):\n            Defines the size of the GST embedding vector dimensions. Defaults to 256.\n\n        gst_num_heads (int):\n            Number of attention heads used by the multi-head attention. Defaults to 4.\n\n        gst_num_style_tokens (int):\n            Number of style token vectors. Defaults to 10.\n    \"\"\"\n\n    gst_style_input_wav: str = None\n    gst_style_input_weights: dict = None\n    gst_embedding_dim: int = 256\n    gst_use_speaker_embedding: bool = False\n    gst_num_heads: int = 4\n    gst_num_style_tokens: int = 10\n\n    def check_values(\n        self,\n    ):\n        \"\"\"Check config fields\"\"\"\n        c = asdict(self)\n        super().check_values()\n        check_argument(\"gst_style_input_weights\", c, restricted=False)\n        check_argument(\"gst_style_input_wav\", c, restricted=False)\n        check_argument(\"gst_embedding_dim\", c, restricted=True, min_val=0, max_val=1000)\n        check_argument(\"gst_use_speaker_embedding\", c, restricted=False)\n        check_argument(\"gst_num_heads\", c, restricted=True, min_val=2, max_val=10)\n        check_argument(\"gst_num_style_tokens\", c, restricted=True, min_val=1, max_val=1000)\n\n\n@dataclass\nclass CapacitronVAEConfig(Coqpit):\n    \"\"\"Defines the capacitron VAE Module\n    Args:\n        capacitron_capacity (int):\n            Defines the 
variational capacity limit of the prosody embeddings. Defaults to 150.\n        capacitron_VAE_embedding_dim (int):\n            Defines the size of the Capacitron embedding vector dimension. Defaults to 128.\n        capacitron_use_text_summary_embeddings (bool):\n            If True, use a text summary embedding in Capacitron. Defaults to True.\n        capacitron_text_summary_embedding_dim (int):\n            Defines the size of the capacitron text embedding vector dimension. Defaults to 128.\n        capacitron_use_speaker_embedding (bool):\n            if True use speaker embeddings in Capacitron. Defaults to False.\n        capacitron_VAE_loss_alpha (float):\n            Weight for the VAE loss of the Tacotron model. If set less than or equal to zero, it disables the\n            corresponding loss function. Defaults to 0.25\n        capacitron_grad_clip (float):\n            Gradient clipping value for all gradients except beta. Defaults to 5.0\n    \"\"\"\n\n    capacitron_loss_alpha: int = 1\n    capacitron_capacity: int = 150\n    capacitron_VAE_embedding_dim: int = 128\n    capacitron_use_text_summary_embeddings: bool = True\n    capacitron_text_summary_embedding_dim: int = 128\n    capacitron_use_speaker_embedding: bool = False\n    capacitron_VAE_loss_alpha: float = 0.25\n    capacitron_grad_clip: float = 5.0\n\n    def check_values(\n        self,\n    ):\n        \"\"\"Check config fields\"\"\"\n        c = asdict(self)\n        super().check_values()\n        check_argument(\"capacitron_capacity\", c, restricted=True, min_val=10, max_val=500)\n        check_argument(\"capacitron_VAE_embedding_dim\", c, restricted=True, min_val=16, max_val=1024)\n        check_argument(\"capacitron_use_speaker_embedding\", c, restricted=False)\n        check_argument(\"capacitron_text_summary_embedding_dim\", c, restricted=False, min_val=16, max_val=512)\n        check_argument(\"capacitron_VAE_loss_alpha\", c, restricted=False)\n        
check_argument(\"capacitron_grad_clip\", c, restricted=False)\n\n\n@dataclass\nclass CharactersConfig(Coqpit):\n    \"\"\"Defines arguments for the `BaseCharacters` or `BaseVocabulary` and their subclasses.\n\n    Args:\n        characters_class (str):\n            Defines the class of the characters used. If None, we pick ```Phonemes``` or ```Graphemes``` based on\n            the configuration. Defaults to None.\n\n        vocab_dict (dict):\n            Defines the vocabulary dictionary used to encode the characters. Defaults to None.\n\n        pad (str):\n            characters in place of empty padding. Defaults to None.\n\n        eos (str):\n            characters showing the end of a sentence. Defaults to None.\n\n        bos (str):\n            characters showing the beginning of a sentence. Defaults to None.\n\n        blank (str):\n            Optional character used between characters by some models for better prosody. Defaults to `_blank`.\n\n        characters (str):\n            character set used by the model. Characters not in this list are ignored when converting input text to\n            a list of sequence IDs. Defaults to None.\n\n        punctuations (str):\n            characters considered as punctuation as parsing the input sentence. Defaults to None.\n\n        phonemes (str):\n            characters considered as parsing phonemes. This is only for backwards compat. Use `characters` for new\n            models. Defaults to None.\n\n        is_unique (bool):\n            remove any duplicate characters in the character lists. It is a bandaid for compatibility with the old\n            models trained with character lists with duplicates. Defaults to True.\n\n        is_sorted (bool):\n            Sort the characters in alphabetical order. 
Defaults to True.\n    \"\"\"\n\n    characters_class: str = None\n\n    # using BaseVocabulary\n    vocab_dict: Dict = None\n\n    # using on BaseCharacters\n    pad: str = None\n    eos: str = None\n    bos: str = None\n    blank: str = None\n    characters: str = None\n    punctuations: str = None\n    phonemes: str = None\n    is_unique: bool = True  # for backwards compatibility of models trained with char sets with duplicates\n    is_sorted: bool = True\n\n\n@dataclass\nclass BaseTTSConfig(BaseTrainingConfig):\n    \"\"\"Shared parameters among all the tts models.\n\n    Args:\n\n        audio (BaseAudioConfig):\n            Audio processor config object instance.\n\n        use_phonemes (bool):\n            enable / disable phoneme use.\n\n        phonemizer (str):\n            Name of the phonemizer to use. If set None, the phonemizer will be selected by `phoneme_language`.\n            Defaults to None.\n\n        phoneme_language (str):\n            Language code for the phonemizer. You can check the list of supported languages by running\n            `python TTS/tts/utils/text/phonemizers/__init__.py`. Defaults to None.\n\n        compute_input_seq_cache (bool):\n            enable / disable precomputation of the phoneme sequences. At the expense of some delay at the beginning of\n            the training, It allows faster data loader time and precise limitation with `max_seq_len` and\n            `min_seq_len`.\n\n        text_cleaner (str):\n            Name of the text cleaner used for cleaning and formatting transcripts.\n\n        enable_eos_bos_chars (bool):\n            enable / disable the use of eos and bos characters.\n\n        test_senteces_file (str):\n            Path to a txt file that has sentences used at test time. 
The file must have a sentence per line.\n\n        phoneme_cache_path (str):\n            Path to the output folder caching the computed phonemes for each sample.\n\n        characters (CharactersConfig):\n            Instance of a CharactersConfig class.\n\n        batch_group_size (int):\n            Size of the batch groups used for bucketing. By default, the dataloader orders samples by the sequence\n            length for a more efficient and stable training. If `batch_group_size > 1` then it performs bucketing to\n            prevent using the same batches for each epoch.\n\n        loss_masking (bool):\n            enable / disable masking loss values against padded segments of samples in a batch.\n\n        min_text_len (int):\n            Minimum length of input text to be used. All shorter samples will be ignored. Defaults to 0.\n\n        max_text_len (int):\n            Maximum length of input text to be used. All longer samples will be ignored. Defaults to float(\"inf\").\n\n        min_audio_len (int):\n            Minimum length of input audio to be used. All shorter samples will be ignored. Defaults to 0.\n\n        max_audio_len (int):\n            Maximum length of input audio to be used. All longer samples will be ignored. The maximum length in the\n            dataset defines the VRAM used in the training. Hence, pay attention to this value if you encounter an\n            OOM error in training. Defaults to float(\"inf\").\n\n        compute_f0 (int):\n            (Not in use yet).\n\n        compute_energy (int):\n            (Not in use yet).\n\n        compute_linear_spec (bool):\n            If True data loader computes and returns linear spectrograms alongside the other data.\n\n        precompute_num_workers (int):\n            Number of workers to precompute features. 
Defaults to 0.\n\n        use_noise_augment (bool):\n            Augment the input audio with random noise.\n\n        start_by_longest (bool):\n            If True, the data loader will start loading the longest batch first. It is useful for checking OOM issues.\n            Defaults to False.\n\n        shuffle (bool):\n            If True, the data loader will shuffle the dataset when there is not sampler defined. Defaults to True.\n\n        drop_last (bool):\n            If True, the data loader will drop the last batch if it is not complete. It helps to prevent\n            issues that emerge from the partial batch statistics. Defaults to True.\n\n        add_blank (bool):\n            Add blank characters between each other two characters. It improves performance for some models at expense\n            of slower run-time due to the longer input sequence.\n\n        datasets (List[BaseDatasetConfig]):\n            List of datasets used for training. If multiple datasets are provided, they are merged and used together\n            for training.\n\n        optimizer (str):\n            Optimizer used for the training. Set one from `torch.optim.Optimizer` or `TTS.utils.training`.\n            Defaults to ``.\n\n        optimizer_params (dict):\n            Optimizer kwargs. Defaults to `{\"betas\": [0.8, 0.99], \"weight_decay\": 0.0}`\n\n        lr_scheduler (str):\n            Learning rate scheduler for the training. Use one from `torch.optim.Scheduler` schedulers or\n            `TTS.utils.training`. Defaults to ``.\n\n        lr_scheduler_params (dict):\n            Parameters for the generator learning rate scheduler. Defaults to `{\"warmup\": 4000}`.\n\n        test_sentences (List[str]):\n            List of sentences to be used at testing. Defaults to '[]'\n\n        eval_split_max_size (int):\n            Number maximum of samples to be used for evaluation in proportion split. 
Defaults to None (Disabled).\n\n        eval_split_size (float):\n            If between 0.0 and 1.0 represents the proportion of the dataset to include in the evaluation set.\n            If > 1, represents the absolute number of evaluation samples. Defaults to 0.01 (1%).\n\n        use_speaker_weighted_sampler (bool):\n            Enable / Disable the batch balancer by speaker. Defaults to ```False```.\n\n        speaker_weighted_sampler_alpha (float):\n            Number that control the influence of the speaker sampler weights. Defaults to ```1.0```.\n\n        use_language_weighted_sampler (bool):\n            Enable / Disable the batch balancer by language. Defaults to ```False```.\n\n        language_weighted_sampler_alpha (float):\n            Number that control the influence of the language sampler weights. Defaults to ```1.0```.\n\n        use_length_weighted_sampler (bool):\n            Enable / Disable the batch balancer by audio length. If enabled the dataset will be divided\n            into 10 buckets considering the min and max audio of the dataset. The sampler weights will be\n            computed forcing to have the same quantity of data for each bucket in each training batch. Defaults to ```False```.\n\n        length_weighted_sampler_alpha (float):\n            Number that control the influence of the length sampler weights. 
Defaults to ```1.0```.\n    \"\"\"\n\n    audio: BaseAudioConfig = field(default_factory=BaseAudioConfig)\n    # phoneme settings\n    use_phonemes: bool = False\n    phonemizer: str = None\n    phoneme_language: str = None\n    compute_input_seq_cache: bool = False\n    text_cleaner: str = None\n    enable_eos_bos_chars: bool = False\n    test_sentences_file: str = \"\"\n    phoneme_cache_path: str = None\n    # vocabulary parameters\n    characters: CharactersConfig = None\n    add_blank: bool = False\n    # training params\n    batch_group_size: int = 0\n    loss_masking: bool = None\n    # dataloading\n    min_audio_len: int = 1\n    max_audio_len: int = float(\"inf\")\n    min_text_len: int = 1\n    max_text_len: int = float(\"inf\")\n    compute_f0: bool = False\n    compute_energy: bool = False\n    compute_linear_spec: bool = False\n    precompute_num_workers: int = 0\n    use_noise_augment: bool = False\n    start_by_longest: bool = False\n    shuffle: bool = False\n    drop_last: bool = False\n    # dataset\n    datasets: List[BaseDatasetConfig] = field(default_factory=lambda: [BaseDatasetConfig()])\n    # optimizer\n    optimizer: str = \"radam\"\n    optimizer_params: dict = None\n    # scheduler\n    lr_scheduler: str = None\n    lr_scheduler_params: dict = field(default_factory=lambda: {})\n    # testing\n    test_sentences: List[str] = field(default_factory=lambda: [])\n    # evaluation\n    eval_split_max_size: int = None\n    eval_split_size: float = 0.01\n    # weighted samplers\n    use_speaker_weighted_sampler: bool = False\n    speaker_weighted_sampler_alpha: float = 1.0\n    use_language_weighted_sampler: bool = False\n    language_weighted_sampler_alpha: float = 1.0\n    use_length_weighted_sampler: bool = False\n    length_weighted_sampler_alpha: float = 1.0\n"
  },
  {
    "path": "TTS/tts/configs/speedy_speech_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\nfrom TTS.tts.models.forward_tts import ForwardTTSArgs\n\n\n@dataclass\nclass SpeedySpeechConfig(BaseTTSConfig):\n    \"\"\"Configure `ForwardTTS` as SpeedySpeech model.\n\n    Example:\n\n        >>> from TTS.tts.configs.speedy_speech_config import SpeedySpeechConfig\n        >>> config = SpeedySpeechConfig()\n\n     Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `speedy_speech`.\n\n        base_model (str):\n            Name of the base model being configured as this model so that 🐸 TTS knows it needs to initiate\n            the base model rather than searching for the `model` implementation. Defaults to `forward_tts`.\n\n        model_args (Coqpit):\n            Model class arguments. Check `FastPitchArgs` for more details. Defaults to `FastPitchArgs()`.\n\n        data_dep_init_steps (int):\n            Number of steps used for computing normalization parameters at the beginning of the training. GlowTTS uses\n            Activation Normalization that pre-computes normalization stats at the beginning and use the same values\n            for the rest. Defaults to 10.\n\n        speakers_file (str):\n            Path to the file containing the list of speakers. Needed at inference for loading matching speaker ids to\n            speaker names. Defaults to `None`.\n\n        use_speaker_embedding (bool):\n            enable / disable using speaker embeddings for multi-speaker models. If set True, the model is\n            in the multi-speaker mode. Defaults to False.\n\n        use_d_vector_file (bool):\n            enable /disable using external speaker embeddings in place of the learned embeddings. Defaults to False.\n\n        d_vector_file (str):\n            Path to the file including pre-computed speaker embeddings. 
Defaults to None.\n\n        d_vector_dim (int):\n            Dimension of the external speaker embeddings. Defaults to 0.\n\n        optimizer (str):\n            Name of the model optimizer. Defaults to `Adam`.\n\n        optimizer_params (dict):\n            Arguments of the model optimizer. Defaults to `{\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6}`.\n\n        lr_scheduler (str):\n            Name of the learning rate scheduler. Defaults to `NoamLR`.\n\n        lr_scheduler_params (dict):\n            Arguments of the learning rate scheduler. Defaults to `{\"warmup_steps\": 4000}`.\n\n        lr (float):\n            Initial learning rate. Defaults to `1e-4`.\n\n        grad_clip (float):\n            Gradient norm clipping value. Defaults to `5.0`.\n\n        spec_loss_type (str):\n            Type of the spectrogram loss. Check `ForwardTTSLoss` for possible values. Defaults to `l1`.\n\n        duration_loss_type (str):\n            Type of the duration loss. Check `ForwardTTSLoss` for possible values. Defaults to `huber`.\n\n        use_ssim_loss (bool):\n            Enable/disable the use of SSIM (Structural Similarity) loss. Defaults to False.\n\n        wd (float):\n            Weight decay coefficient. Defaults to `1e-7`.\n\n        ssim_loss_alpha (float):\n            Weight for the SSIM loss. If set to 0, disables the SSIM loss. Defaults to 1.0.\n\n        dur_loss_alpha (float):\n            Weight for the duration predictor's loss. If set to 0, disables the huber loss. Defaults to 1.0.\n\n        spec_loss_alpha (float):\n            Weight for the L1 spectrogram loss. If set to 0, disables the L1 loss. Defaults to 1.0.\n\n        binary_align_loss_alpha (float):\n            Weight for the binary alignment loss. If set to 0, disables the binary loss. Defaults to 0.3.\n\n        binary_loss_warmup_epochs (int):\n            Number of epochs to gradually increase the binary loss impact. 
Defaults to 150.\n\n        min_seq_len (int):\n            Minimum input sequence length to be used at training.\n\n        max_seq_len (int):\n            Maximum input sequence length to be used at training. Larger values result in more VRAM usage.\n    \"\"\"\n\n    model: str = \"speedy_speech\"\n    base_model: str = \"forward_tts\"\n\n    # set model args as SpeedySpeech\n    model_args: ForwardTTSArgs = ForwardTTSArgs(\n        use_pitch=False,\n        encoder_type=\"residual_conv_bn\",\n        encoder_params={\n            \"kernel_size\": 4,\n            \"dilations\": 4 * [1, 2, 4] + [1],\n            \"num_conv_blocks\": 2,\n            \"num_res_blocks\": 13,\n        },\n        decoder_type=\"residual_conv_bn\",\n        decoder_params={\n            \"kernel_size\": 4,\n            \"dilations\": 4 * [1, 2, 4, 8] + [1],\n            \"num_conv_blocks\": 2,\n            \"num_res_blocks\": 17,\n        },\n        out_channels=80,\n        hidden_channels=128,\n        positional_encoding=True,\n        detach_duration_predictor=True,\n    )\n\n    # multi-speaker settings\n    num_speakers: int = 0\n    speakers_file: str = None\n    use_speaker_embedding: bool = False\n    use_d_vector_file: bool = False\n    d_vector_file: str = False\n    d_vector_dim: int = 0\n\n    # optimizer parameters\n    optimizer: str = \"Adam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6})\n    lr_scheduler: str = \"NoamLR\"\n    lr_scheduler_params: dict = field(default_factory=lambda: {\"warmup_steps\": 4000})\n    lr: float = 1e-4\n    grad_clip: float = 5.0\n\n    # loss params\n    spec_loss_type: str = \"l1\"\n    duration_loss_type: str = \"huber\"\n    use_ssim_loss: bool = False\n    ssim_loss_alpha: float = 1.0\n    dur_loss_alpha: float = 1.0\n    spec_loss_alpha: float = 1.0\n    aligner_loss_alpha: float = 1.0\n    binary_align_loss_alpha: float = 0.3\n    binary_loss_warmup_epochs: int = 
150\n\n    # overrides\n    min_seq_len: int = 13\n    max_seq_len: int = 200\n    r: int = 1  # DO NOT CHANGE\n\n    # dataset configs\n    compute_f0: bool = False\n    f0_cache_path: str = None\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\",\n            \"Be a voice, not an echo.\",\n            \"I'm sorry Dave. I'm afraid I can't do that.\",\n            \"This cake is great. It's so delicious and moist.\",\n            \"Prior to November 22, 1963.\",\n        ]\n    )\n\n    def __post_init__(self):\n        # Pass multi-speaker parameters to the model args as `model.init_multispeaker()` looks for it there.\n        if self.num_speakers > 0:\n            self.model_args.num_speakers = self.num_speakers\n\n        # speaker embedding settings\n        if self.use_speaker_embedding:\n            self.model_args.use_speaker_embedding = True\n        if self.speakers_file:\n            self.model_args.speakers_file = self.speakers_file\n\n        # d-vector settings\n        if self.use_d_vector_file:\n            self.model_args.use_d_vector_file = True\n        if self.d_vector_dim is not None and self.d_vector_dim > 0:\n            self.model_args.d_vector_dim = self.d_vector_dim\n        if self.d_vector_file:\n            self.model_args.d_vector_file = self.d_vector_file\n"
  },
  {
    "path": "TTS/tts/configs/tacotron2_config.py",
    "content": "from dataclasses import dataclass\n\nfrom TTS.tts.configs.tacotron_config import TacotronConfig\n\n\n@dataclass\nclass Tacotron2Config(TacotronConfig):\n    \"\"\"Defines parameters for Tacotron2 based models.\n\n    Example:\n\n        >>> from TTS.tts.configs.tacotron2_config import Tacotron2Config\n        >>> config = Tacotron2Config()\n\n    Check `TacotronConfig` for argument descriptions.\n    \"\"\"\n\n    model: str = \"tacotron2\"\n    out_channels: int = 80\n    encoder_in_features: int = 512\n    decoder_in_features: int = 512\n"
  },
  {
    "path": "TTS/tts/configs/tacotron_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig, CapacitronVAEConfig, GSTConfig\n\n\n@dataclass\nclass TacotronConfig(BaseTTSConfig):\n    \"\"\"Defines parameters for Tacotron based models.\n\n    Example:\n\n        >>> from TTS.tts.configs.tacotron_config import TacotronConfig\n        >>> config = TacotronConfig()\n\n    Args:\n        model (str):\n            Model name used to select the right model class to initilize. Defaults to `Tacotron`.\n        use_gst (bool):\n            enable / disable the use of Global Style Token modules. Defaults to False.\n        gst (GSTConfig):\n            Instance of `GSTConfig` class.\n        gst_style_input (str):\n            Path to the wav file used at inference to set the speech style through GST. If `GST` is enabled and\n            this is not defined, the model uses a zero vector as an input. Defaults to None.\n        use_capacitron_vae (bool):\n            enable / disable the use of Capacitron modules. Defaults to False.\n        capacitron_vae (CapacitronConfig):\n            Instance of `CapacitronConfig` class.\n        num_chars (int):\n            Number of characters used by the model. It must be defined before initializing the model. Defaults to None.\n        num_speakers (int):\n            Number of speakers for multi-speaker models. Defaults to 1.\n        r (int):\n            Initial number of output frames that the decoder computed per iteration. Larger values makes training and inference\n            faster but reduces the quality of the output frames. This must be equal to the largest `r` value used in\n            `gradual_training` schedule. Defaults to 1.\n        gradual_training (List[List]):\n            Parameters for the gradual training schedule. 
It is in the form `[[a, b, c], [d, e, f], ...]` where `a` is\n            the step number to start using the rest of the values, `b` is the `r` value and `c` is the batch size.\n            If set to None, no gradual training is used. Defaults to None.\n        memory_size (int):\n            Defines the number of previous frames used by the Prenet. If set to < 0, then it uses only the last frame.\n            Defaults to -1.\n        prenet_type (str):\n            `original` or `bn`. `original` sets the default Prenet and `bn` uses the Batch Normalization version of the\n            Prenet. Defaults to `original`.\n        prenet_dropout (bool):\n            enable / disable the use of dropout in the Prenet. Defaults to True.\n        prenet_dropout_at_inference (bool):\n            enable / disable the use of dropout in the Prenet at inference time. Defaults to False.\n        stopnet (bool):\n            enable / disable the Stopnet that predicts the end of the decoder sequence. Defaults to True.\n        stopnet_pos_weight (float):\n            Weight that is applied to over-weight positive instances in the Stopnet loss. Use larger values with\n            datasets with longer sentences. Defaults to 0.2.\n        max_decoder_steps (int):\n            Max number of steps allowed for the decoder. Defaults to 10000.\n        encoder_in_features (int):\n            Channels of encoder input and character embedding tensors. Defaults to 256.\n        decoder_in_features (int):\n            Channels of decoder input and encoder output tensors. Defaults to 256.\n        out_channels (int):\n            Channels of the final model output. It must match the spectrogram size. Defaults to 513.\n        separate_stopnet (bool):\n            Use a distinct Stopnet which is trained separately from the rest of the model. Defaults to True.\n        attention_type (str):\n            attention type. Check ```TTS.tts.layers.attentions.init_attn```. 
Defaults to 'original'.\n        attention_heads (int):\n            Number of attention heads for GMM attention. Defaults to None.\n        windowing (bool):\n            It is especially useful at inference to keep attention alignment diagonal. Defaults to False.\n        use_forward_attn (bool):\n            It is only valid if ```attention_type``` is ```original```. Defaults to False.\n        forward_attn_mask (bool):\n            enable/disable extra masking over forward attention. It is useful at inference to prevent\n            possible attention failures. Defaults to False.\n        transition_agent (bool):\n            enable/disable transition agent in forward attention. Defaults to False.\n        location_attn (bool):\n            enable/disable location sensitive attention as in the original Tacotron2 paper.\n            It is only valid if ```attention_type``` is ```original```. Defaults to True.\n        bidirectional_decoder (bool):\n            enable/disable bidirectional decoding. Defaults to False.\n        double_decoder_consistency (bool):\n            enable/disable double decoder consistency. Defaults to False.\n        ddc_r (int):\n            reduction rate used by the coarse decoder when `double_decoder_consistency` is in use. Set this\n            as a multiple of the `r` value. Defaults to 6.\n        speakers_file (str):\n            Path to the speaker mapping file for the Speaker Manager. Defaults to None.\n        use_speaker_embedding (bool):\n            enable / disable using speaker embeddings for multi-speaker models. If set to True, the model is\n            in the multi-speaker mode. Defaults to False.\n        use_d_vector_file (bool):\n            enable / disable using external speaker embeddings in place of the learned embeddings. Defaults to False.\n        d_vector_file (str):\n            Path to the file including pre-computed speaker embeddings. 
Defaults to None.\n        optimizer (str):\n            Optimizer used for the training. Set one from `torch.optim.Optimizer` or `TTS.utils.training`.\n            Defaults to `RAdam`.\n        optimizer_params (dict):\n            Optimizer kwargs. Defaults to `{\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6}`.\n        lr_scheduler (str):\n            Learning rate scheduler for the training. Use one of the `torch.optim.lr_scheduler` schedulers or\n            `TTS.utils.training`. Defaults to `NoamLR`.\n        lr_scheduler_params (dict):\n            Parameters for the learning rate scheduler. Defaults to `{\"warmup_steps\": 4000}`.\n        lr (float):\n            Initial learning rate. Defaults to `1e-4`.\n        wd (float):\n            Weight decay coefficient. Defaults to `1e-6`.\n        grad_clip (float):\n            Gradient clipping threshold. Defaults to `5`.\n        seq_len_norm (bool):\n            enable / disable the sequence length normalization in the loss functions. If set to True, the loss of a sample\n            is divided by the sequence length. Defaults to False.\n        loss_masking (bool):\n            enable / disable masking the paddings of the samples in loss computation. Defaults to True.\n        decoder_loss_alpha (float):\n            Weight for the decoder loss of the Tacotron model. If set less than or equal to zero, it disables the\n            corresponding loss function. Defaults to 0.25\n        postnet_loss_alpha (float):\n            Weight for the postnet loss of the Tacotron model. If set less than or equal to zero, it disables the\n            corresponding loss function. Defaults to 0.25\n        postnet_diff_spec_alpha (float):\n            Weight for the postnet differential loss of the Tacotron model. If set less than or equal to zero, it disables the\n            corresponding loss function. 
Defaults to 0.25\n        decoder_diff_spec_alpha (float):\n\n            Weight for the decoder differential loss of the Tacotron model. If set less than or equal to zero, it disables the\n            corresponding loss function. Defaults to 0.25\n        decoder_ssim_alpha (float):\n            Weight for the decoder SSIM loss of the Tacotron model. If set less than or equal to zero, it disables the\n            corresponding loss function. Defaults to 0.25\n        postnet_ssim_alpha (float):\n            Weight for the postnet SSIM loss of the Tacotron model. If set less than or equal to zero, it disables the\n            corresponding loss function. Defaults to 0.25\n        ga_alpha (float):\n            Weight for the guided attention loss. If set less than or equal to zero, it disables the corresponding loss\n            function. Defaults to 5.\n    \"\"\"\n\n    model: str = \"tacotron\"\n    # model_params: TacotronArgs = field(default_factory=lambda: TacotronArgs())\n    use_gst: bool = False\n    gst: GSTConfig = None\n    gst_style_input: str = None\n\n    use_capacitron_vae: bool = False\n    capacitron_vae: CapacitronVAEConfig = None\n\n    # model specific params\n    num_speakers: int = 1\n    num_chars: int = 0\n    r: int = 2\n    gradual_training: List[List[int]] = None\n    memory_size: int = -1\n    prenet_type: str = \"original\"\n    prenet_dropout: bool = True\n    prenet_dropout_at_inference: bool = False\n    stopnet: bool = True\n    separate_stopnet: bool = True\n    stopnet_pos_weight: float = 0.2\n    max_decoder_steps: int = 10000\n    encoder_in_features: int = 256\n    decoder_in_features: int = 256\n    decoder_output_dim: int = 80\n    out_channels: int = 513\n\n    # attention layers\n    attention_type: str = \"original\"\n    attention_heads: int = None\n    attention_norm: str = \"sigmoid\"\n    attention_win: bool = False\n    windowing: bool = False\n    use_forward_attn: bool = False\n    forward_attn_mask: bool = False\n 
   transition_agent: bool = False\n    location_attn: bool = True\n\n    # advance methods\n    bidirectional_decoder: bool = False\n    double_decoder_consistency: bool = False\n    ddc_r: int = 6\n\n    # multi-speaker settings\n    speakers_file: str = None\n    use_speaker_embedding: bool = False\n    speaker_embedding_dim: int = 512\n    use_d_vector_file: bool = False\n    d_vector_file: str = False\n    d_vector_dim: int = None\n\n    # optimizer parameters\n    optimizer: str = \"RAdam\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.9, 0.998], \"weight_decay\": 1e-6})\n    lr_scheduler: str = \"NoamLR\"\n    lr_scheduler_params: dict = field(default_factory=lambda: {\"warmup_steps\": 4000})\n    lr: float = 1e-4\n    grad_clip: float = 5.0\n    seq_len_norm: bool = False\n    loss_masking: bool = True\n\n    # loss params\n    decoder_loss_alpha: float = 0.25\n    postnet_loss_alpha: float = 0.25\n    postnet_diff_spec_alpha: float = 0.25\n    decoder_diff_spec_alpha: float = 0.25\n    decoder_ssim_alpha: float = 0.25\n    postnet_ssim_alpha: float = 0.25\n    ga_alpha: float = 5.0\n\n    # testing\n    test_sentences: List[str] = field(\n        default_factory=lambda: [\n            \"It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\",\n            \"Be a voice, not an echo.\",\n            \"I'm sorry Dave. I'm afraid I can't do that.\",\n            \"This cake is great. It's so delicious and moist.\",\n            \"Prior to November 22, 1963.\",\n        ]\n    )\n\n    def check_values(self):\n        if self.gradual_training:\n            assert (\n                self.gradual_training[0][1] == self.r\n            ), f\"[!] the first scheduled gradual training `r` must be equal to the model's `r` value. 
{self.gradual_training[0][1]} vs {self.r}\"\n        if self.model == \"tacotron\" and self.audio is not None:\n            assert self.out_channels == (\n                self.audio.fft_size // 2 + 1\n            ), f\"{self.out_channels} vs {self.audio.fft_size // 2 + 1}\"\n        if self.model == \"tacotron2\" and self.audio is not None:\n            assert self.out_channels == self.audio.num_mels\n"
  },
  {
    "path": "TTS/tts/configs/vits_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.tts.configs.shared_configs import BaseTTSConfig\nfrom TTS.tts.models.vits import VitsArgs, VitsAudioConfig\n\n\n@dataclass\nclass VitsConfig(BaseTTSConfig):\n    \"\"\"Defines parameters for VITS End2End TTS model.\n\n    Args:\n        model (str):\n            Model name. Do not change unless you know what you are doing.\n\n        model_args (VitsArgs):\n            Model architecture arguments. Defaults to `VitsArgs()`.\n\n        audio (VitsAudioConfig):\n            Audio processing configuration. Defaults to `VitsAudioConfig()`.\n\n        grad_clip (List):\n            Gradient clipping thresholds for each optimizer. Defaults to `[1000.0, 1000.0]`.\n\n        lr_gen (float):\n            Initial learning rate for the generator. Defaults to 0.0002.\n\n        lr_disc (float):\n            Initial learning rate for the discriminator. Defaults to 0.0002.\n\n        lr_scheduler_gen (str):\n            Name of the learning rate scheduler for the generator. One of the `torch.optim.lr_scheduler.*`. Defaults to\n            `ExponentialLR`.\n\n        lr_scheduler_gen_params (dict):\n            Parameters for the learning rate scheduler of the generator. Defaults to `{'gamma': 0.999875, \"last_epoch\":-1}`.\n\n        lr_scheduler_disc (str):\n            Name of the learning rate scheduler for the discriminator. One of the `torch.optim.lr_scheduler.*`. Defaults to\n            `ExponentialLR`.\n\n        lr_scheduler_disc_params (dict):\n            Parameters for the learning rate scheduler of the discriminator. Defaults to `{'gamma': 0.999875, \"last_epoch\":-1}`.\n\n        scheduler_after_epoch (bool):\n            If true, step the schedulers after each epoch else after each step. Defaults to `False`.\n\n        optimizer (str):\n            Name of the optimizer to use with both the generator and the discriminator networks. One of the\n            `torch.optim.*`. 
Defaults to `AdamW`.\n\n        kl_loss_alpha (float):\n            Loss weight for the KL loss. Defaults to 1.0.\n\n        disc_loss_alpha (float):\n            Loss weight for the discriminator loss. Defaults to 1.0.\n\n        gen_loss_alpha (float):\n            Loss weight for the generator loss. Defaults to 1.0.\n\n        feat_loss_alpha (float):\n            Loss weight for the feature matching loss. Defaults to 1.0.\n\n        mel_loss_alpha (float):\n            Loss weight for the mel loss. Defaults to 45.0.\n\n        return_wav (bool):\n            If true, the data loader returns the waveform as well as the other outputs. Do not change. Defaults to `True`.\n\n        compute_linear_spec (bool):\n            If true, the linear spectrogram is computed and returned alongside the mel output. Do not change. Defaults to `True`.\n\n        use_weighted_sampler (bool):\n            If true, use a weighted sampler with bucketing for balancing samples between datasets used in training. Defaults to `False`.\n\n        weighted_sampler_attrs (dict):\n            Keys returned by the formatter to be used by the weighted sampler. For example `{\"root_path\": 2.0, \"speaker_name\": 1.0}` sets sample probabilities\n            by overweighting `root_path` by 2.0. Defaults to `{}`.\n\n        weighted_sampler_multipliers (dict):\n            Weight each unique value of a key returned by the formatter for weighted sampling.\n            For example `{\"root_path\":{\"/raid/datasets/libritts-clean-16khz-bwe-coqui_44khz/LibriTTS/train-clean-100/\":1.0, \"/raid/datasets/libritts-clean-16khz-bwe-coqui_44khz/LibriTTS/train-clean-360/\": 0.5}}`.\n            It will sample instances from `train-clean-100` twice as often as from `train-clean-360`. Defaults to `{}`.\n\n        r (int):\n            Number of spectrogram frames to be generated at a time. Do not change. Defaults to `1`.\n\n        add_blank (bool):\n            If true, a blank token is added in between every character. 
Defaults to `True`.\n\n        test_sentences (List[List]):\n            List of sentences with speaker and language information to be used for testing.\n\n        language_ids_file (str):\n            Path to the language ids file.\n\n        use_language_embedding (bool):\n            If true, language embedding is used. Defaults to `False`.\n\n    Note:\n        Check :class:`TTS.tts.configs.shared_configs.BaseTTSConfig` for the inherited parameters.\n\n    Example:\n\n        >>> from TTS.tts.configs.vits_config import VitsConfig\n        >>> config = VitsConfig()\n    \"\"\"\n\n    model: str = \"vits\"\n    # model specific params\n    model_args: VitsArgs = field(default_factory=VitsArgs)\n    audio: VitsAudioConfig = VitsAudioConfig()\n\n    # optimizer\n    grad_clip: List[float] = field(default_factory=lambda: [1000, 1000])\n    lr_gen: float = 0.0002\n    lr_disc: float = 0.0002\n    lr_scheduler_gen: str = \"ExponentialLR\"\n    lr_scheduler_gen_params: dict = field(default_factory=lambda: {\"gamma\": 0.999875, \"last_epoch\": -1})\n    lr_scheduler_disc: str = \"ExponentialLR\"\n    lr_scheduler_disc_params: dict = field(default_factory=lambda: {\"gamma\": 0.999875, \"last_epoch\": -1})\n    scheduler_after_epoch: bool = True\n    optimizer: str = \"AdamW\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.8, 0.99], \"eps\": 1e-9, \"weight_decay\": 0.01})\n\n    # loss params\n    kl_loss_alpha: float = 1.0\n    disc_loss_alpha: float = 1.0\n    gen_loss_alpha: float = 1.0\n    feat_loss_alpha: float = 1.0\n    mel_loss_alpha: float = 45.0\n    dur_loss_alpha: float = 1.0\n    speaker_encoder_loss_alpha: float = 1.0\n\n    # data loader params\n    return_wav: bool = True\n    compute_linear_spec: bool = True\n\n    # sampler params\n    use_weighted_sampler: bool = False  # TODO: move it to the base config\n    weighted_sampler_attrs: dict = field(default_factory=lambda: {})\n    weighted_sampler_multipliers: dict = 
field(default_factory=lambda: {})\n\n    # overrides\n    r: int = 1  # DO NOT CHANGE\n    add_blank: bool = True\n\n    # testing\n    test_sentences: List[List] = field(\n        default_factory=lambda: [\n            [\"It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.\"],\n            [\"Be a voice, not an echo.\"],\n            [\"I'm sorry Dave. I'm afraid I can't do that.\"],\n            [\"This cake is great. It's so delicious and moist.\"],\n            [\"Prior to November 22, 1963.\"],\n        ]\n    )\n\n    # multi-speaker settings\n    # use speaker embedding layer\n    num_speakers: int = 0\n    use_speaker_embedding: bool = False\n    speakers_file: str = None\n    speaker_embedding_channels: int = 256\n    language_ids_file: str = None\n    use_language_embedding: bool = False\n\n    # use d-vectors\n    use_d_vector_file: bool = False\n    d_vector_file: List[str] = None\n    d_vector_dim: int = None\n\n    def __post_init__(self):\n        for key, val in self.model_args.items():\n            if hasattr(self, key):\n                self[key] = val\n"
  },
  {
    "path": "TTS/tts/datasets/__init__.py",
    "content": "import os\nimport sys\nfrom collections import Counter\nfrom pathlib import Path\nfrom typing import Callable, Dict, List, Tuple, Union\n\nimport numpy as np\n\nfrom TTS.tts.datasets.dataset import *\nfrom TTS.tts.datasets.formatters import *\n\n\ndef split_dataset(items, eval_split_max_size=None, eval_split_size=0.01):\n    \"\"\"Split a dataset into train and eval. Consider speaker distribution in multi-speaker training.\n\n    Args:\n        items (List[List]):\n            A list of samples. Each sample is a list of `[audio_path, text, speaker_id]`.\n\n        eval_split_max_size (int):\n            Number maximum of samples to be used for evaluation in proportion split. Defaults to None (Disabled).\n\n        eval_split_size (float):\n            If between 0.0 and 1.0 represents the proportion of the dataset to include in the evaluation set.\n            If > 1, represents the absolute number of evaluation samples. Defaults to 0.01 (1%).\n    \"\"\"\n    speakers = [item[\"speaker_name\"] for item in items]\n    is_multi_speaker = len(set(speakers)) > 1\n    if eval_split_size > 1:\n        eval_split_size = int(eval_split_size)\n    else:\n        if eval_split_max_size:\n            eval_split_size = min(eval_split_max_size, int(len(items) * eval_split_size))\n        else:\n            eval_split_size = int(len(items) * eval_split_size)\n\n    assert (\n        eval_split_size > 0\n    ), \" [!] You do not have enough samples for the evaluation set. 
You can work around this by setting the 'eval_split_size' parameter to a minimum of {}\".format(\n        1 / len(items)\n    )\n    np.random.seed(0)\n    np.random.shuffle(items)\n    if is_multi_speaker:\n        items_eval = []\n        speakers = [item[\"speaker_name\"] for item in items]\n        speaker_counter = Counter(speakers)\n        while len(items_eval) < eval_split_size:\n            item_idx = np.random.randint(0, len(items))\n            speaker_to_be_removed = items[item_idx][\"speaker_name\"]\n            if speaker_counter[speaker_to_be_removed] > 1:\n                items_eval.append(items[item_idx])\n                speaker_counter[speaker_to_be_removed] -= 1\n                del items[item_idx]\n        return items_eval, items\n    return items[:eval_split_size], items[eval_split_size:]\n\n\ndef add_extra_keys(metadata, language, dataset_name):\n    for item in metadata:\n        # add language name\n        item[\"language\"] = language\n        # add unique audio name\n        relfilepath = os.path.splitext(os.path.relpath(item[\"audio_file\"], item[\"root_path\"]))[0]\n        audio_unique_name = f\"{dataset_name}#{relfilepath}\"\n        item[\"audio_unique_name\"] = audio_unique_name\n    return metadata\n\n\ndef load_tts_samples(\n    datasets: Union[List[Dict], Dict],\n    eval_split=True,\n    formatter: Callable = None,\n    eval_split_max_size=None,\n    eval_split_size=0.01,\n) -> Tuple[List[List], List[List]]:\n    \"\"\"Parse the dataset from the datasets config, load the samples as a List and load the attention alignments if provided.\n    If `formatter` is not None, apply the formatter to the samples else pick the formatter from the available ones based\n    on the dataset name.\n\n    Args:\n        datasets (List[Dict], Dict): A list of datasets or a single dataset dictionary. If multiple datasets are\n            in the list, they are all merged.\n\n        eval_split (bool, optional): If true, create an evaluation split. 
If no eval split is provided explicitly, generate\n            one automatically. Defaults to True.\n\n        formatter (Callable, optional): The preprocessing function to be applied to create the list of samples. It\n            must take the root_path and the meta_file name and return a list of samples in the format of\n            `[[text, audio_path, speaker_id], ...]`. See the available formatters in `TTS.tts.datasets.formatters` as\n            examples. Defaults to None.\n\n        eval_split_max_size (int):\n            Maximum number of samples to be used for evaluation in proportion split. Defaults to None (Disabled).\n\n        eval_split_size (float):\n            If between 0.0 and 1.0 represents the proportion of the dataset to include in the evaluation set.\n            If > 1, represents the absolute number of evaluation samples. Defaults to 0.01 (1%).\n\n    Returns:\n        Tuple[List[List], List[List]]: training and evaluation splits of the dataset.\n    \"\"\"\n    meta_data_train_all = []\n    meta_data_eval_all = [] if eval_split else None\n    if not isinstance(datasets, list):\n        datasets = [datasets]\n    for dataset in datasets:\n        formatter_name = dataset[\"formatter\"]\n        dataset_name = dataset[\"dataset_name\"]\n        root_path = dataset[\"path\"]\n        meta_file_train = dataset[\"meta_file_train\"]\n        meta_file_val = dataset[\"meta_file_val\"]\n        ignored_speakers = dataset[\"ignored_speakers\"]\n        language = dataset[\"language\"]\n\n        # setup the right data processor\n        if formatter is None:\n            formatter = _get_formatter_by_name(formatter_name)\n        # load train set\n        meta_data_train = formatter(root_path, meta_file_train, ignored_speakers=ignored_speakers)\n        assert len(meta_data_train) > 0, f\" [!] 
No training samples found in {root_path}/{meta_file_train}\"\n\n        meta_data_train = add_extra_keys(meta_data_train, language, dataset_name)\n\n        print(f\" | > Found {len(meta_data_train)} files in {Path(root_path).resolve()}\")\n        # load evaluation split if set\n        if eval_split:\n            if meta_file_val:\n                meta_data_eval = formatter(root_path, meta_file_val, ignored_speakers=ignored_speakers)\n                meta_data_eval = add_extra_keys(meta_data_eval, language, dataset_name)\n            else:\n                eval_size_per_dataset = eval_split_max_size // len(datasets) if eval_split_max_size else None\n                meta_data_eval, meta_data_train = split_dataset(meta_data_train, eval_size_per_dataset, eval_split_size)\n            meta_data_eval_all += meta_data_eval\n        meta_data_train_all += meta_data_train\n        # load attention masks for the duration predictor training\n        if dataset.meta_file_attn_mask:\n            meta_data = dict(load_attention_mask_meta_data(dataset[\"meta_file_attn_mask\"]))\n            for idx, ins in enumerate(meta_data_train_all):\n                attn_file = meta_data[ins[\"audio_file\"]].strip()\n                meta_data_train_all[idx].update({\"alignment_file\": attn_file})\n            if meta_data_eval_all:\n                for idx, ins in enumerate(meta_data_eval_all):\n                    attn_file = meta_data[ins[\"audio_file\"]].strip()\n                    meta_data_eval_all[idx].update({\"alignment_file\": attn_file})\n        # set none for the next iter\n        formatter = None\n    return meta_data_train_all, meta_data_eval_all\n\n\ndef load_attention_mask_meta_data(metafile_path):\n    \"\"\"Load meta data file created by compute_attention_masks.py\"\"\"\n    with open(metafile_path, \"r\", encoding=\"utf-8\") as f:\n        lines = f.readlines()\n\n    meta_data = []\n    for line in lines:\n        wav_file, attn_file = line.split(\"|\")\n        
meta_data.append([wav_file, attn_file])\n    return meta_data\n\n\ndef _get_formatter_by_name(name):\n    \"\"\"Returns the respective preprocessing function.\"\"\"\n    thismodule = sys.modules[__name__]\n    return getattr(thismodule, name.lower())\n\n\ndef find_unique_chars(data_samples, verbose=True):\n    texts = \"\".join(item[0] for item in data_samples)\n    chars = set(texts)\n    lower_chars = filter(lambda c: c.islower(), chars)\n    chars_force_lower = [c.lower() for c in chars]\n    chars_force_lower = set(chars_force_lower)\n\n    if verbose:\n        print(f\" > Number of unique characters: {len(chars)}\")\n        print(f\" > Unique characters: {''.join(sorted(chars))}\")\n        print(f\" > Unique lower characters: {''.join(sorted(lower_chars))}\")\n        print(f\" > Unique all forced to lower characters: {''.join(sorted(chars_force_lower))}\")\n    return chars_force_lower\n"
  },
  {
    "path": "TTS/tts/datasets/dataset.py",
    "content": "import base64\nimport collections\nimport os\nimport random\nfrom typing import Dict, List, Union\n\nimport numpy as np\nimport torch\nimport tqdm\nfrom torch.utils.data import Dataset\n\nfrom TTS.tts.utils.data import prepare_data, prepare_stop_target, prepare_tensor\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.utils.audio.numpy_transforms import compute_energy as calculate_energy\n\n# to prevent too many open files error as suggested here\n# https://github.com/pytorch/pytorch/issues/11201#issuecomment-421146936\ntorch.multiprocessing.set_sharing_strategy(\"file_system\")\n\n\ndef _parse_sample(item):\n    language_name = None\n    attn_file = None\n    if len(item) == 5:\n        text, wav_file, speaker_name, language_name, attn_file = item\n    elif len(item) == 4:\n        text, wav_file, speaker_name, language_name = item\n    elif len(item) == 3:\n        text, wav_file, speaker_name = item\n    else:\n        raise ValueError(\" [!] Dataset cannot parse the sample.\")\n    return text, wav_file, speaker_name, language_name, attn_file\n\n\ndef noise_augment_audio(wav):\n    return wav + (1.0 / 32768.0) * np.random.rand(*wav.shape)\n\n\ndef string2filename(string):\n    # generate a safe and reversible filename based on a string\n    filename = base64.urlsafe_b64encode(string.encode(\"utf-8\")).decode(\"utf-8\", \"ignore\")\n    return filename\n\n\nclass TTSDataset(Dataset):\n    def __init__(\n        self,\n        outputs_per_step: int = 1,\n        compute_linear_spec: bool = False,\n        ap: AudioProcessor = None,\n        samples: List[Dict] = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        compute_f0: bool = False,\n        compute_energy: bool = False,\n        f0_cache_path: str = None,\n        energy_cache_path: str = None,\n        return_wav: bool = False,\n        batch_group_size: int = 0,\n        min_text_len: int = 0,\n        max_text_len: int = float(\"inf\"),\n        min_audio_len: int = 0,\n       
 max_audio_len: int = float(\"inf\"),\n        phoneme_cache_path: str = None,\n        precompute_num_workers: int = 0,\n        speaker_id_mapping: Dict = None,\n        d_vector_mapping: Dict = None,\n        language_id_mapping: Dict = None,\n        use_noise_augment: bool = False,\n        start_by_longest: bool = False,\n        verbose: bool = False,\n    ):\n        \"\"\"Generic 📂 data loader for `tts` models. It is configurable for different outputs and needs.\n\n        If you need something different, you can subclass and override.\n\n        Args:\n            outputs_per_step (int): Number of time frames predicted per step.\n\n            compute_linear_spec (bool): compute linear spectrogram if True.\n\n            ap (TTS.tts.utils.AudioProcessor): Audio processor object.\n\n            samples (list): List of dataset samples.\n\n            tokenizer (TTSTokenizer): tokenizer to convert text to sequence IDs. If None init internally else\n                use the given. Defaults to None.\n\n            compute_f0 (bool): compute f0 if True. Defaults to False.\n\n            compute_energy (bool): compute energy if True. Defaults to False.\n\n            f0_cache_path (str): Path to store f0 cache. Defaults to None.\n\n            energy_cache_path (str): Path to store energy cache. Defaults to None.\n\n            return_wav (bool): Return the waveform of the sample. Defaults to False.\n\n            batch_group_size (int): Range of batch randomization after sorting\n                sequences by length. It shuffles each batch with bucketing to gather similar lenght sequences in a\n                batch. Set 0 to disable. Defaults to 0.\n\n            min_text_len (int): Minimum length of input text to be used. All shorter samples will be ignored.\n                Defaults to 0.\n\n            max_text_len (int): Maximum length of input text to be used. 
All longer samples will be ignored.\n                Defaults to float(\"inf\").\n\n            min_audio_len (int): Minimum length of input audio to be used. All shorter samples will be ignored.\n                Defaults to 0.\n\n            max_audio_len (int): Maximum length of input audio to be used. All longer samples will be ignored.\n                The maximum length in the dataset defines the VRAM used in the training. Hence, pay attention to\n                this value if you encounter an OOM error in training. Defaults to float(\"inf\").\n\n            phoneme_cache_path (str): Path to cache computed phonemes. It writes phonemes of each sample to a\n                separate file. Defaults to None.\n\n            precompute_num_workers (int): Number of workers to precompute features. Defaults to 0.\n\n            speaker_id_mapping (dict): Mapping of speaker names to IDs used to compute embedding vectors by the\n                embedding layer. Defaults to None.\n\n            d_vector_mapping (dict): Mapping of wav files to computed d-vectors. Defaults to None.\n\n            use_noise_augment (bool): Enable adding random noise to wav for augmentation. Defaults to False.\n\n            start_by_longest (bool): Start by longest sequence. It is especially useful to check OOM. Defaults to False.\n\n            verbose (bool): Print diagnostic information. 
Defaults to false.\n        \"\"\"\n        super().__init__()\n        self.batch_group_size = batch_group_size\n        self._samples = samples\n        self.outputs_per_step = outputs_per_step\n        self.compute_linear_spec = compute_linear_spec\n        self.return_wav = return_wav\n        self.compute_f0 = compute_f0\n        self.compute_energy = compute_energy\n        self.f0_cache_path = f0_cache_path\n        self.energy_cache_path = energy_cache_path\n        self.min_audio_len = min_audio_len\n        self.max_audio_len = max_audio_len\n        self.min_text_len = min_text_len\n        self.max_text_len = max_text_len\n        self.ap = ap\n        self.phoneme_cache_path = phoneme_cache_path\n        self.speaker_id_mapping = speaker_id_mapping\n        self.d_vector_mapping = d_vector_mapping\n        self.language_id_mapping = language_id_mapping\n        self.use_noise_augment = use_noise_augment\n        self.start_by_longest = start_by_longest\n\n        self.verbose = verbose\n        self.rescue_item_idx = 1\n        self.pitch_computed = False\n        self.tokenizer = tokenizer\n\n        if self.tokenizer.use_phonemes:\n            self.phoneme_dataset = PhonemeDataset(\n                self.samples, self.tokenizer, phoneme_cache_path, precompute_num_workers=precompute_num_workers\n            )\n\n        if compute_f0:\n            self.f0_dataset = F0Dataset(\n                self.samples, self.ap, cache_path=f0_cache_path, precompute_num_workers=precompute_num_workers\n            )\n        if compute_energy:\n            self.energy_dataset = EnergyDataset(\n                self.samples, self.ap, cache_path=energy_cache_path, precompute_num_workers=precompute_num_workers\n            )\n        if self.verbose:\n            self.print_logs()\n\n    @property\n    def lengths(self):\n        lens = []\n        for item in self.samples:\n            _, wav_file, *_ = _parse_sample(item)\n            audio_len = 
os.path.getsize(wav_file) / 16 * 8  # assuming 16bit audio\n            lens.append(audio_len)\n        return lens\n\n    @property\n    def samples(self):\n        return self._samples\n\n    @samples.setter\n    def samples(self, new_samples):\n        self._samples = new_samples\n        if hasattr(self, \"f0_dataset\"):\n            self.f0_dataset.samples = new_samples\n        if hasattr(self, \"energy_dataset\"):\n            self.energy_dataset.samples = new_samples\n        if hasattr(self, \"phoneme_dataset\"):\n            self.phoneme_dataset.samples = new_samples\n\n    def __len__(self):\n        return len(self.samples)\n\n    def __getitem__(self, idx):\n        return self.load_data(idx)\n\n    def print_logs(self, level: int = 0) -> None:\n        indent = \"\\t\" * level\n        print(\"\\n\")\n        print(f\"{indent}> DataLoader initialization\")\n        print(f\"{indent}| > Tokenizer:\")\n        self.tokenizer.print_logs(level + 1)\n        print(f\"{indent}| > Number of instances : {len(self.samples)}\")\n\n    def load_wav(self, filename):\n        waveform = self.ap.load_wav(filename)\n        assert waveform.size > 0\n        return waveform\n\n    def get_phonemes(self, idx, text):\n        out_dict = self.phoneme_dataset[idx]\n        assert text == out_dict[\"text\"], f\"{text} != {out_dict['text']}\"\n        assert len(out_dict[\"token_ids\"]) > 0\n        return out_dict\n\n    def get_f0(self, idx):\n        out_dict = self.f0_dataset[idx]\n        item = self.samples[idx]\n        assert item[\"audio_unique_name\"] == out_dict[\"audio_unique_name\"]\n        return out_dict\n\n    def get_energy(self, idx):\n        out_dict = self.energy_dataset[idx]\n        item = self.samples[idx]\n        assert item[\"audio_unique_name\"] == out_dict[\"audio_unique_name\"]\n        return out_dict\n\n    @staticmethod\n    def get_attn_mask(attn_file):\n        return np.load(attn_file)\n\n    def get_token_ids(self, idx, text):\n        
if self.tokenizer.use_phonemes:\n            token_ids = self.get_phonemes(idx, text)[\"token_ids\"]\n        else:\n            token_ids = self.tokenizer.text_to_ids(text)\n        return np.array(token_ids, dtype=np.int32)\n\n    def load_data(self, idx):\n        item = self.samples[idx]\n\n        raw_text = item[\"text\"]\n\n        wav = np.asarray(self.load_wav(item[\"audio_file\"]), dtype=np.float32)\n\n        # apply noise for augmentation\n        if self.use_noise_augment:\n            wav = noise_augment_audio(wav)\n\n        # get token ids\n        token_ids = self.get_token_ids(idx, item[\"text\"])\n\n        # get pre-computed attention maps\n        attn = None\n        if \"alignment_file\" in item:\n            attn = self.get_attn_mask(item[\"alignment_file\"])\n\n        # after phonemization the text length may change\n        # this is a shameful 🤭 hack to prevent longer phonemes\n        # TODO: find a better fix\n        if len(token_ids) > self.max_text_len or len(wav) < self.min_audio_len:\n            self.rescue_item_idx += 1\n            return self.load_data(self.rescue_item_idx)\n\n        # get f0 values\n        f0 = None\n        if self.compute_f0:\n            f0 = self.get_f0(idx)[\"f0\"]\n        energy = None\n        if self.compute_energy:\n            energy = self.get_energy(idx)[\"energy\"]\n\n        sample = {\n            \"raw_text\": raw_text,\n            \"token_ids\": token_ids,\n            \"wav\": wav,\n            \"pitch\": f0,\n            \"energy\": energy,\n            \"attn\": attn,\n            \"item_idx\": item[\"audio_file\"],\n            \"speaker_name\": item[\"speaker_name\"],\n            \"language_name\": item[\"language\"],\n            \"wav_file_name\": os.path.basename(item[\"audio_file\"]),\n            \"audio_unique_name\": item[\"audio_unique_name\"],\n        }\n        return sample\n\n    @staticmethod\n    def _compute_lengths(samples):\n        new_samples = []\n        for 
item in samples:\n            audio_length = os.path.getsize(item[\"audio_file\"]) / 16 * 8  # assuming 16bit audio\n            text_lenght = len(item[\"text\"])\n            item[\"audio_length\"] = audio_length\n            item[\"text_length\"] = text_lenght\n            new_samples += [item]\n        return new_samples\n\n    @staticmethod\n    def filter_by_length(lengths: List[int], min_len: int, max_len: int):\n        idxs = np.argsort(lengths)  # ascending order\n        ignore_idx = []\n        keep_idx = []\n        for idx in idxs:\n            length = lengths[idx]\n            if length < min_len or length > max_len:\n                ignore_idx.append(idx)\n            else:\n                keep_idx.append(idx)\n        return ignore_idx, keep_idx\n\n    @staticmethod\n    def sort_by_length(samples: List[List]):\n        audio_lengths = [s[\"audio_length\"] for s in samples]\n        idxs = np.argsort(audio_lengths)  # ascending order\n        return idxs\n\n    @staticmethod\n    def create_buckets(samples, batch_group_size: int):\n        assert batch_group_size > 0\n        for i in range(len(samples) // batch_group_size):\n            offset = i * batch_group_size\n            end_offset = offset + batch_group_size\n            temp_items = samples[offset:end_offset]\n            random.shuffle(temp_items)\n            samples[offset:end_offset] = temp_items\n        return samples\n\n    @staticmethod\n    def _select_samples_by_idx(idxs, samples):\n        samples_new = []\n        for idx in idxs:\n            samples_new.append(samples[idx])\n        return samples_new\n\n    def preprocess_samples(self):\n        r\"\"\"Sort `items` based on text length or audio length in ascending order. 
Filter out samples outside the length\n        range.\n        \"\"\"\n        samples = self._compute_lengths(self.samples)\n\n        # filter out samples whose text or audio length falls outside the configured ranges\n        text_lengths = [i[\"text_length\"] for i in samples]\n        audio_lengths = [i[\"audio_length\"] for i in samples]\n        text_ignore_idx, text_keep_idx = self.filter_by_length(text_lengths, self.min_text_len, self.max_text_len)\n        audio_ignore_idx, audio_keep_idx = self.filter_by_length(audio_lengths, self.min_audio_len, self.max_audio_len)\n        keep_idx = list(set(audio_keep_idx) & set(text_keep_idx))\n        ignore_idx = list(set(audio_ignore_idx) | set(text_ignore_idx))\n\n        samples = self._select_samples_by_idx(keep_idx, samples)\n\n        sorted_idxs = self.sort_by_length(samples)\n\n        if self.start_by_longest:\n            longest_idxs = sorted_idxs[-1]\n            sorted_idxs[-1] = sorted_idxs[0]\n            sorted_idxs[0] = longest_idxs\n\n        samples = self._select_samples_by_idx(sorted_idxs, samples)\n\n        if len(samples) == 0:\n            raise RuntimeError(\" [!] 
No samples left\")\n\n        # shuffle batch groups\n        # create batches with similar length items\n        # the larger the `batch_group_size`, the higher the length variety in a batch.\n        if self.batch_group_size > 0:\n            samples = self.create_buckets(samples, self.batch_group_size)\n\n        # update items to the new sorted items\n        audio_lengths = [s[\"audio_length\"] for s in samples]\n        text_lengths = [s[\"text_length\"] for s in samples]\n        self.samples = samples\n\n        if self.verbose:\n            print(\" | > Preprocessing samples\")\n            print(\" | > Max text length: {}\".format(np.max(text_lengths)))\n            print(\" | > Min text length: {}\".format(np.min(text_lengths)))\n            print(\" | > Avg text length: {}\".format(np.mean(text_lengths)))\n            print(\" | \")\n            print(\" | > Max audio length: {}\".format(np.max(audio_lengths)))\n            print(\" | > Min audio length: {}\".format(np.min(audio_lengths)))\n            print(\" | > Avg audio length: {}\".format(np.mean(audio_lengths)))\n            print(f\" | > Num. instances discarded samples: {len(ignore_idx)}\")\n            print(\" | > Batch group size: {}.\".format(self.batch_group_size))\n\n    @staticmethod\n    def _sort_batch(batch, text_lengths):\n        \"\"\"Sort the batch by the input text length for RNN efficiency.\n\n        Args:\n            batch (Dict): Batch returned by `__getitem__`.\n            text_lengths (List[int]): Lengths of the input character sequences.\n        \"\"\"\n        text_lengths, ids_sorted_decreasing = torch.sort(torch.LongTensor(text_lengths), dim=0, descending=True)\n        batch = [batch[idx] for idx in ids_sorted_decreasing]\n        return batch, text_lengths, ids_sorted_decreasing\n\n    def collate_fn(self, batch):\n        r\"\"\"\n        Perform preprocessing and create a final data batch:\n        1. Sort batch instances by text-length\n        2. 
Convert Audio signal to features.\n        3. PAD sequences wrt r.\n        4. Load to Torch.\n        \"\"\"\n\n        # Puts each data field into a tensor with outer dimension batch size\n        if isinstance(batch[0], collections.abc.Mapping):\n            token_ids_lengths = np.array([len(d[\"token_ids\"]) for d in batch])\n\n            # sort items with text input length for RNN efficiency\n            batch, token_ids_lengths, ids_sorted_decreasing = self._sort_batch(batch, token_ids_lengths)\n\n            # convert list of dicts to dict of lists\n            batch = {k: [dic[k] for dic in batch] for k in batch[0]}\n\n            # get language ids from language names\n            if self.language_id_mapping is not None:\n                language_ids = [self.language_id_mapping[ln] for ln in batch[\"language_name\"]]\n            else:\n                language_ids = None\n            # get pre-computed d-vectors\n            if self.d_vector_mapping is not None:\n                embedding_keys = list(batch[\"audio_unique_name\"])\n                d_vectors = [self.d_vector_mapping[w][\"embedding\"] for w in embedding_keys]\n            else:\n                d_vectors = None\n\n            # get numerical speaker ids from speaker names\n            if self.speaker_id_mapping:\n                speaker_ids = [self.speaker_id_mapping[sn] for sn in batch[\"speaker_name\"]]\n            else:\n                speaker_ids = None\n            # compute features\n            mel = [self.ap.melspectrogram(w).astype(\"float32\") for w in batch[\"wav\"]]\n\n            mel_lengths = [m.shape[1] for m in mel]\n\n            # lengths adjusted by the reduction factor\n            mel_lengths_adjusted = [\n                m.shape[1] + (self.outputs_per_step - (m.shape[1] % self.outputs_per_step))\n                if m.shape[1] % self.outputs_per_step\n                else m.shape[1]\n                for m in mel\n            ]\n\n            # compute 'stop token' 
targets\n            stop_targets = [np.array([0.0] * (mel_len - 1) + [1.0]) for mel_len in mel_lengths]\n\n            # PAD stop targets\n            stop_targets = prepare_stop_target(stop_targets, self.outputs_per_step)\n\n            # PAD sequences with longest instance in the batch\n            token_ids = prepare_data(batch[\"token_ids\"]).astype(np.int32)\n\n            # PAD features with longest instance\n            mel = prepare_tensor(mel, self.outputs_per_step)\n\n            # B x D x T --> B x T x D\n            mel = mel.transpose(0, 2, 1)\n\n            # convert things to pytorch\n            token_ids_lengths = torch.LongTensor(token_ids_lengths)\n            token_ids = torch.LongTensor(token_ids)\n            mel = torch.FloatTensor(mel).contiguous()\n            mel_lengths = torch.LongTensor(mel_lengths)\n            stop_targets = torch.FloatTensor(stop_targets)\n\n            # speaker vectors\n            if d_vectors is not None:\n                d_vectors = torch.FloatTensor(d_vectors)\n\n            if speaker_ids is not None:\n                speaker_ids = torch.LongTensor(speaker_ids)\n\n            if language_ids is not None:\n                language_ids = torch.LongTensor(language_ids)\n\n            # compute linear spectrogram\n            linear = None\n            if self.compute_linear_spec:\n                linear = [self.ap.spectrogram(w).astype(\"float32\") for w in batch[\"wav\"]]\n                linear = prepare_tensor(linear, self.outputs_per_step)\n                linear = linear.transpose(0, 2, 1)\n                assert mel.shape[1] == linear.shape[1]\n                linear = torch.FloatTensor(linear).contiguous()\n\n            # format waveforms\n            wav_padded = None\n            if self.return_wav:\n                wav_lengths = [w.shape[0] for w in batch[\"wav\"]]\n                max_wav_len = max(mel_lengths_adjusted) * self.ap.hop_length\n                wav_lengths = 
torch.LongTensor(wav_lengths)\n                wav_padded = torch.zeros(len(batch[\"wav\"]), 1, max_wav_len)\n                for i, w in enumerate(batch[\"wav\"]):\n                    mel_length = mel_lengths_adjusted[i]\n                    w = np.pad(w, (0, self.ap.hop_length * self.outputs_per_step), mode=\"edge\")\n                    w = w[: mel_length * self.ap.hop_length]\n                    wav_padded[i, :, : w.shape[0]] = torch.from_numpy(w)\n                wav_padded.transpose_(1, 2)\n\n            # format F0\n            if self.compute_f0:\n                pitch = prepare_data(batch[\"pitch\"])\n                assert mel.shape[1] == pitch.shape[1], f\"[!] {mel.shape} vs {pitch.shape}\"\n                pitch = torch.FloatTensor(pitch)[:, None, :].contiguous()  # B x 1 xT\n            else:\n                pitch = None\n            # format energy\n            if self.compute_energy:\n                energy = prepare_data(batch[\"energy\"])\n                assert mel.shape[1] == energy.shape[1], f\"[!] {mel.shape} vs {energy.shape}\"\n                energy = torch.FloatTensor(energy)[:, None, :].contiguous()  # B x 1 xT\n            else:\n                energy = None\n            # format attention masks\n            attns = None\n            if batch[\"attn\"][0] is not None:\n                attns = [batch[\"attn\"][idx].T for idx in ids_sorted_decreasing]\n                for idx, attn in enumerate(attns):\n                    pad2 = mel.shape[1] - attn.shape[1]\n                    pad1 = token_ids.shape[1] - attn.shape[0]\n                    assert pad1 >= 0 and pad2 >= 0, f\"[!] 
Negative padding - {pad1} and {pad2}\"\n                    attn = np.pad(attn, [[0, pad1], [0, pad2]])\n                    attns[idx] = attn\n                attns = prepare_tensor(attns, self.outputs_per_step)\n                attns = torch.FloatTensor(attns).unsqueeze(1)\n\n            return {\n                \"token_id\": token_ids,\n                \"token_id_lengths\": token_ids_lengths,\n                \"speaker_names\": batch[\"speaker_name\"],\n                \"linear\": linear,\n                \"mel\": mel,\n                \"mel_lengths\": mel_lengths,\n                \"stop_targets\": stop_targets,\n                \"item_idxs\": batch[\"item_idx\"],\n                \"d_vectors\": d_vectors,\n                \"speaker_ids\": speaker_ids,\n                \"attns\": attns,\n                \"waveform\": wav_padded,\n                \"raw_text\": batch[\"raw_text\"],\n                \"pitch\": pitch,\n                \"energy\": energy,\n                \"language_ids\": language_ids,\n                \"audio_unique_names\": batch[\"audio_unique_name\"],\n            }\n\n        raise TypeError(\n            (\n                \"batch must contain tensors, numbers, dicts or lists;\\\n                         found {}\".format(\n                    type(batch[0])\n                )\n            )\n        )\n\n\nclass PhonemeDataset(Dataset):\n    \"\"\"Phoneme Dataset for converting input text to phonemes and then token IDs\n\n    At initialization, it pre-computes the phonemes under `cache_path` and loads them in training to reduce data\n    loading latency. If `cache_path` is already present, it skips the pre-computation.\n\n    Args:\n        samples (Union[List[List], List[Dict]]):\n            List of samples. Each sample is a list or a dict.\n\n        tokenizer (TTSTokenizer):\n            Tokenizer to convert input text to phonemes.\n\n        cache_path (str):\n            Path to cache phonemes. 
If `cache_path` is already present or None, it skips the pre-computation.\n\n        precompute_num_workers (int):\n            Number of workers used for pre-computing the phonemes. Defaults to 0.\n    \"\"\"\n\n    def __init__(\n        self,\n        samples: Union[List[Dict], List[List]],\n        tokenizer: \"TTSTokenizer\",\n        cache_path: str,\n        precompute_num_workers=0,\n    ):\n        self.samples = samples\n        self.tokenizer = tokenizer\n        self.cache_path = cache_path\n        if cache_path is not None and not os.path.exists(cache_path):\n            os.makedirs(cache_path)\n            self.precompute(precompute_num_workers)\n\n    def __getitem__(self, index):\n        item = self.samples[index]\n        ids = self.compute_or_load(string2filename(item[\"audio_unique_name\"]), item[\"text\"], item[\"language\"])\n        ph_hat = self.tokenizer.ids_to_text(ids)\n        return {\"text\": item[\"text\"], \"ph_hat\": ph_hat, \"token_ids\": ids, \"token_ids_len\": len(ids)}\n\n    def __len__(self):\n        return len(self.samples)\n\n    def compute_or_load(self, file_name, text, language):\n        \"\"\"Compute phonemes for the given text.\n\n        If the phonemes are already cached, load them from cache.\n        \"\"\"\n        file_ext = \"_phoneme.npy\"\n        cache_path = os.path.join(self.cache_path, file_name + file_ext)\n        try:\n            ids = np.load(cache_path)\n        except FileNotFoundError:\n            ids = self.tokenizer.text_to_ids(text, language=language)\n            np.save(cache_path, ids)\n        return ids\n\n    def get_pad_id(self):\n        \"\"\"Get pad token ID for sequence padding\"\"\"\n        return self.tokenizer.pad_id\n\n    def precompute(self, num_workers=1):\n        \"\"\"Precompute phonemes for all samples.\n\n        We use pytorch dataloader because we are lazy.\n        \"\"\"\n        print(\"[*] Pre-computing phonemes...\")\n        with tqdm.tqdm(total=len(self)) as 
pbar:\n            batch_size = num_workers if num_workers > 0 else 1\n            dataloder = torch.utils.data.DataLoader(\n                batch_size=batch_size, dataset=self, shuffle=False, num_workers=num_workers, collate_fn=self.collate_fn\n            )\n            for _ in dataloder:\n                pbar.update(batch_size)\n\n    def collate_fn(self, batch):\n        ids = [item[\"token_ids\"] for item in batch]\n        ids_lens = [item[\"token_ids_len\"] for item in batch]\n        texts = [item[\"text\"] for item in batch]\n        texts_hat = [item[\"ph_hat\"] for item in batch]\n        ids_lens_max = max(ids_lens)\n        ids_torch = torch.LongTensor(len(ids), ids_lens_max).fill_(self.get_pad_id())\n        for i, ids_len in enumerate(ids_lens):\n            ids_torch[i, :ids_len] = torch.LongTensor(ids[i])\n        return {\"text\": texts, \"ph_hat\": texts_hat, \"token_ids\": ids_torch}\n\n    def print_logs(self, level: int = 0) -> None:\n        indent = \"\\t\" * level\n        print(\"\\n\")\n        print(f\"{indent}> PhonemeDataset \")\n        print(f\"{indent}| > Tokenizer:\")\n        self.tokenizer.print_logs(level + 1)\n        print(f\"{indent}| > Number of instances : {len(self.samples)}\")\n\n\nclass F0Dataset:\n    \"\"\"F0 Dataset for computing F0 from wav files in CPU\n\n    Pre-compute F0 values for all the samples at initialization if `cache_path` is not None or already present. It\n    also computes the mean and std of F0 values if `normalize_f0` is True.\n\n    Args:\n        samples (Union[List[List], List[Dict]]):\n            List of samples. Each sample is a list or a dict.\n\n        ap (AudioProcessor):\n            AudioProcessor to compute F0 from wav files.\n\n        cache_path (str):\n            Path to cache F0 values. 
If `cache_path` is already present or None, it skips the pre-computation.\n            Defaults to None.\n\n        precompute_num_workers (int):\n            Number of workers used for pre-computing the F0 values. Defaults to 0.\n\n        normalize_f0 (bool):\n            Whether to normalize F0 values by mean and std. Defaults to True.\n    \"\"\"\n\n    def __init__(\n        self,\n        samples: Union[List[List], List[Dict]],\n        ap: \"AudioProcessor\",\n        verbose=False,\n        cache_path: str = None,\n        precompute_num_workers=0,\n        normalize_f0=True,\n    ):\n        self.samples = samples\n        self.ap = ap\n        self.verbose = verbose\n        self.cache_path = cache_path\n        self.normalize_f0 = normalize_f0\n        self.pad_id = 0.0\n        self.mean = None\n        self.std = None\n        if cache_path is not None and not os.path.exists(cache_path):\n            os.makedirs(cache_path)\n            self.precompute(precompute_num_workers)\n        if normalize_f0:\n            self.load_stats(cache_path)\n\n    def __getitem__(self, idx):\n        item = self.samples[idx]\n        f0 = self.compute_or_load(item[\"audio_file\"], string2filename(item[\"audio_unique_name\"]))\n        if self.normalize_f0:\n            assert self.mean is not None and self.std is not None, \" [!] 
Mean and STD is not available\"\n            f0 = self.normalize(f0)\n        return {\"audio_unique_name\": item[\"audio_unique_name\"], \"f0\": f0}\n\n    def __len__(self):\n        return len(self.samples)\n\n    def precompute(self, num_workers=0):\n        print(\"[*] Pre-computing F0s...\")\n        with tqdm.tqdm(total=len(self)) as pbar:\n            batch_size = num_workers if num_workers > 0 else 1\n            # we do not normalize at preproessing\n            normalize_f0 = self.normalize_f0\n            self.normalize_f0 = False\n            dataloder = torch.utils.data.DataLoader(\n                batch_size=batch_size, dataset=self, shuffle=False, num_workers=num_workers, collate_fn=self.collate_fn\n            )\n            computed_data = []\n            for batch in dataloder:\n                f0 = batch[\"f0\"]\n                computed_data.append(f for f in f0)\n                pbar.update(batch_size)\n            self.normalize_f0 = normalize_f0\n\n        if self.normalize_f0:\n            computed_data = [tensor for batch in computed_data for tensor in batch]  # flatten\n            pitch_mean, pitch_std = self.compute_pitch_stats(computed_data)\n            pitch_stats = {\"mean\": pitch_mean, \"std\": pitch_std}\n            np.save(os.path.join(self.cache_path, \"pitch_stats\"), pitch_stats, allow_pickle=True)\n\n    def get_pad_id(self):\n        return self.pad_id\n\n    @staticmethod\n    def create_pitch_file_path(file_name, cache_path):\n        pitch_file = os.path.join(cache_path, file_name + \"_pitch.npy\")\n        return pitch_file\n\n    @staticmethod\n    def _compute_and_save_pitch(ap, wav_file, pitch_file=None):\n        wav = ap.load_wav(wav_file)\n        pitch = ap.compute_f0(wav)\n        if pitch_file:\n            np.save(pitch_file, pitch)\n        return pitch\n\n    @staticmethod\n    def compute_pitch_stats(pitch_vecs):\n        nonzeros = np.concatenate([v[np.where(v != 0.0)[0]] for v in pitch_vecs])\n        
mean, std = np.mean(nonzeros), np.std(nonzeros)\n        return mean, std\n\n    def load_stats(self, cache_path):\n        stats_path = os.path.join(cache_path, \"pitch_stats.npy\")\n        stats = np.load(stats_path, allow_pickle=True).item()\n        self.mean = stats[\"mean\"].astype(np.float32)\n        self.std = stats[\"std\"].astype(np.float32)\n\n    def normalize(self, pitch):\n        zero_idxs = np.where(pitch == 0.0)[0]\n        pitch = pitch - self.mean\n        pitch = pitch / self.std\n        pitch[zero_idxs] = 0.0\n        return pitch\n\n    def denormalize(self, pitch):\n        zero_idxs = np.where(pitch == 0.0)[0]\n        pitch *= self.std\n        pitch += self.mean\n        pitch[zero_idxs] = 0.0\n        return pitch\n\n    def compute_or_load(self, wav_file, audio_unique_name):\n        \"\"\"\n        compute pitch and return a numpy array of pitch values\n        \"\"\"\n        pitch_file = self.create_pitch_file_path(audio_unique_name, self.cache_path)\n        if not os.path.exists(pitch_file):\n            pitch = self._compute_and_save_pitch(self.ap, wav_file, pitch_file)\n        else:\n            pitch = np.load(pitch_file)\n        return pitch.astype(np.float32)\n\n    def collate_fn(self, batch):\n        audio_unique_name = [item[\"audio_unique_name\"] for item in batch]\n        f0s = [item[\"f0\"] for item in batch]\n        f0_lens = [len(item[\"f0\"]) for item in batch]\n        f0_lens_max = max(f0_lens)\n        f0s_torch = torch.LongTensor(len(f0s), f0_lens_max).fill_(self.get_pad_id())\n        for i, f0_len in enumerate(f0_lens):\n            f0s_torch[i, :f0_len] = torch.LongTensor(f0s[i])\n        return {\"audio_unique_name\": audio_unique_name, \"f0\": f0s_torch, \"f0_lens\": f0_lens}\n\n    def print_logs(self, level: int = 0) -> None:\n        indent = \"\\t\" * level\n        print(\"\\n\")\n        print(f\"{indent}> F0Dataset \")\n        print(f\"{indent}| > Number of instances : 
{len(self.samples)}\")\n\n\nclass EnergyDataset:\n    \"\"\"Energy Dataset for computing energy from wav files on CPU\n\n    Pre-compute energy values for all the samples at initialization if `cache_path` is set and does not exist yet. It\n    also computes the mean and std of energy values if `normalize_energy` is True.\n\n    Args:\n        samples (Union[List[List], List[Dict]]):\n            List of samples. Each sample is a list or a dict.\n\n        ap (AudioProcessor):\n            AudioProcessor to compute energy from wav files.\n\n        cache_path (str):\n            Path to cache energy values. If `cache_path` is already present or None, it skips the pre-computation.\n            Defaults to None.\n\n        precompute_num_workers (int):\n            Number of workers used for pre-computing the energy values. Defaults to 0.\n\n        normalize_energy (bool):\n            Whether to normalize energy values by mean and std. Defaults to True.\n    \"\"\"\n\n    def __init__(\n        self,\n        samples: Union[List[List], List[Dict]],\n        ap: \"AudioProcessor\",\n        verbose=False,\n        cache_path: str = None,\n        precompute_num_workers=0,\n        normalize_energy=True,\n    ):\n        self.samples = samples\n        self.ap = ap\n        self.verbose = verbose\n        self.cache_path = cache_path\n        self.normalize_energy = normalize_energy\n        self.pad_id = 0.0\n        self.mean = None\n        self.std = None\n        if cache_path is not None and not os.path.exists(cache_path):\n            os.makedirs(cache_path)\n            self.precompute(precompute_num_workers)\n        if normalize_energy:\n            self.load_stats(cache_path)\n\n    def __getitem__(self, idx):\n        item = self.samples[idx]\n        energy = self.compute_or_load(item[\"audio_file\"], string2filename(item[\"audio_unique_name\"]))\n        if self.normalize_energy:\n            assert self.mean is not None and self.std is not None, \" [!] 
Mean and STD is not available\"\n            energy = self.normalize(energy)\n        return {\"audio_unique_name\": item[\"audio_unique_name\"], \"energy\": energy}\n\n    def __len__(self):\n        return len(self.samples)\n\n    def precompute(self, num_workers=0):\n        print(\"[*] Pre-computing energys...\")\n        with tqdm.tqdm(total=len(self)) as pbar:\n            batch_size = num_workers if num_workers > 0 else 1\n            # we do not normalize at preproessing\n            normalize_energy = self.normalize_energy\n            self.normalize_energy = False\n            dataloder = torch.utils.data.DataLoader(\n                batch_size=batch_size, dataset=self, shuffle=False, num_workers=num_workers, collate_fn=self.collate_fn\n            )\n            computed_data = []\n            for batch in dataloder:\n                energy = batch[\"energy\"]\n                computed_data.append(e for e in energy)\n                pbar.update(batch_size)\n            self.normalize_energy = normalize_energy\n\n        if self.normalize_energy:\n            computed_data = [tensor for batch in computed_data for tensor in batch]  # flatten\n            energy_mean, energy_std = self.compute_energy_stats(computed_data)\n            energy_stats = {\"mean\": energy_mean, \"std\": energy_std}\n            np.save(os.path.join(self.cache_path, \"energy_stats\"), energy_stats, allow_pickle=True)\n\n    def get_pad_id(self):\n        return self.pad_id\n\n    @staticmethod\n    def create_energy_file_path(wav_file, cache_path):\n        file_name = os.path.splitext(os.path.basename(wav_file))[0]\n        energy_file = os.path.join(cache_path, file_name + \"_energy.npy\")\n        return energy_file\n\n    @staticmethod\n    def _compute_and_save_energy(ap, wav_file, energy_file=None):\n        wav = ap.load_wav(wav_file)\n        energy = calculate_energy(wav, fft_size=ap.fft_size, hop_length=ap.hop_length, win_length=ap.win_length)\n        if energy_file:\n  
          np.save(energy_file, energy)\n        return energy\n\n    @staticmethod\n    def compute_energy_stats(energy_vecs):\n        nonzeros = np.concatenate([v[np.where(v != 0.0)[0]] for v in energy_vecs])\n        mean, std = np.mean(nonzeros), np.std(nonzeros)\n        return mean, std\n\n    def load_stats(self, cache_path):\n        stats_path = os.path.join(cache_path, \"energy_stats.npy\")\n        stats = np.load(stats_path, allow_pickle=True).item()\n        self.mean = stats[\"mean\"].astype(np.float32)\n        self.std = stats[\"std\"].astype(np.float32)\n\n    def normalize(self, energy):\n        zero_idxs = np.where(energy == 0.0)[0]\n        energy = energy - self.mean\n        energy = energy / self.std\n        energy[zero_idxs] = 0.0\n        return energy\n\n    def denormalize(self, energy):\n        zero_idxs = np.where(energy == 0.0)[0]\n        energy *= self.std\n        energy += self.mean\n        energy[zero_idxs] = 0.0\n        return energy\n\n    def compute_or_load(self, wav_file, audio_unique_name):\n        \"\"\"\n        compute energy and return a numpy array of energy values\n        \"\"\"\n        energy_file = self.create_energy_file_path(audio_unique_name, self.cache_path)\n        if not os.path.exists(energy_file):\n            energy = self._compute_and_save_energy(self.ap, wav_file, energy_file)\n        else:\n            energy = np.load(energy_file)\n        return energy.astype(np.float32)\n\n    def collate_fn(self, batch):\n        audio_unique_name = [item[\"audio_unique_name\"] for item in batch]\n        energys = [item[\"energy\"] for item in batch]\n        energy_lens = [len(item[\"energy\"]) for item in batch]\n        energy_lens_max = max(energy_lens)\n        energys_torch = torch.LongTensor(len(energys), energy_lens_max).fill_(self.get_pad_id())\n        for i, energy_len in enumerate(energy_lens):\n            energys_torch[i, :energy_len] = torch.LongTensor(energys[i])\n        return 
{\"audio_unique_name\": audio_unique_name, \"energy\": energys_torch, \"energy_lens\": energy_lens}\n\n    def print_logs(self, level: int = 0) -> None:\n        indent = \"\\t\" * level\n        print(\"\\n\")\n        print(f\"{indent}> EnergyDataset \")\n        print(f\"{indent}| > Number of instances : {len(self.samples)}\")\n"
  },
  {
    "path": "TTS/tts/datasets/formatters.py",
    "content": "import os\nimport re\nimport xml.etree.ElementTree as ET\nfrom glob import glob\nfrom pathlib import Path\nfrom typing import List\n\nimport pandas as pd\nfrom tqdm import tqdm\n\n########################\n# DATASETS\n########################\n\n\ndef coqui(root_path, meta_file, ignored_speakers=None):\n    \"\"\"Interal dataset formatter.\"\"\"\n    filepath = os.path.join(root_path, meta_file)\n    # ensure there are 4 columns for every line\n    with open(filepath, \"r\", encoding=\"utf8\") as f:\n        lines = f.readlines()\n    num_cols = len(lines[0].split(\"|\"))  # take the first row as reference\n    for idx, line in enumerate(lines[1:]):\n        if len(line.split(\"|\")) != num_cols:\n            print(f\" > Missing column in line {idx + 1} -> {line.strip()}\")\n    # load metadata\n    metadata = pd.read_csv(os.path.join(root_path, meta_file), sep=\"|\")\n    assert all(x in metadata.columns for x in [\"audio_file\", \"text\"])\n    speaker_name = None if \"speaker_name\" in metadata.columns else \"coqui\"\n    emotion_name = None if \"emotion_name\" in metadata.columns else \"neutral\"\n    items = []\n    not_found_counter = 0\n    for row in metadata.itertuples():\n        if speaker_name is None and ignored_speakers is not None and row.speaker_name in ignored_speakers:\n            continue\n        audio_path = os.path.join(root_path, row.audio_file)\n        if not os.path.exists(audio_path):\n            not_found_counter += 1\n            continue\n        items.append(\n            {\n                \"text\": row.text,\n                \"audio_file\": audio_path,\n                \"speaker_name\": speaker_name if speaker_name is not None else row.speaker_name,\n                \"emotion_name\": emotion_name if emotion_name is not None else row.emotion_name,\n                \"root_path\": root_path,\n            }\n        )\n    if not_found_counter > 0:\n        print(f\" | > [!] 
{not_found_counter} files not found\")\n    return items\n\n\ndef tweb(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalize TWEB dataset.\n    https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset\n    \"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"tweb\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"\\t\")\n            wav_file = os.path.join(root_path, cols[0] + \".wav\")\n            text = cols[1]\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef mozilla(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes Mozilla meta data files to TTS format\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"mozilla\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"|\")\n            wav_file = cols[1].strip()\n            text = cols[0].strip()\n            wav_file = os.path.join(root_path, \"wavs\", wav_file)\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef mozilla_de(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes Mozilla meta data files to TTS format\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"mozilla\"\n    with open(txt_file, \"r\", encoding=\"ISO 8859-1\") as ttf:\n        for line in ttf:\n            cols = line.strip().split(\"|\")\n            wav_file = cols[0].strip()\n            text = cols[1].strip()\n            folder_name = f\"BATCH_{wav_file.split('_')[0]}_FINAL\"\n            wav_file = os.path.join(root_path, folder_name, wav_file)\n  
          items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef mailabs(root_path, meta_files=None, ignored_speakers=None):\n    \"\"\"Normalizes M-AI-Labs meta data files to TTS format\n\n    Args:\n        root_path (str): root folder of the MAILAB language folder.\n        meta_files (str):  list of meta files to be used in the training. If None, finds all the csv files\n            recursively. Defaults to None\n    \"\"\"\n    speaker_regex = re.compile(f\"by_book{os.sep}(male|female){os.sep}(?P<speaker_name>[^{os.sep}]+){os.sep}\")\n    if not meta_files:\n        csv_files = glob(root_path + f\"{os.sep}**{os.sep}metadata.csv\", recursive=True)\n    else:\n        csv_files = meta_files\n\n    # meta_files = [f.strip() for f in meta_files.split(\",\")]\n    items = []\n    for csv_file in csv_files:\n        if os.path.isfile(csv_file):\n            txt_file = csv_file\n        else:\n            txt_file = os.path.join(root_path, csv_file)\n\n        folder = os.path.dirname(txt_file)\n        # determine speaker based on folder structure...\n        speaker_name_match = speaker_regex.search(txt_file)\n        if speaker_name_match is None:\n            continue\n        speaker_name = speaker_name_match.group(\"speaker_name\")\n        # ignore speakers\n        if isinstance(ignored_speakers, list):\n            if speaker_name in ignored_speakers:\n                continue\n        print(\" | > {}\".format(csv_file))\n        with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n            for line in ttf:\n                cols = line.split(\"|\")\n                if not meta_files:\n                    wav_file = os.path.join(folder, \"wavs\", cols[0] + \".wav\")\n                else:\n                    wav_file = os.path.join(root_path, folder.replace(\"metadata.csv\", \"\"), \"wavs\", cols[0] + \".wav\")\n                if os.path.isfile(wav_file):\n          
          text = cols[1].strip()\n                    items.append(\n                        {\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path}\n                    )\n                else:\n                    # M-AI-Labs have some missing samples, so just print the warning\n                    print(\"> File %s does not exist!\" % (wav_file))\n    return items\n\n\ndef ljspeech(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes the LJSpeech meta data file to TTS format\n    https://keithito.com/LJ-Speech-Dataset/\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"ljspeech\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, \"wavs\", cols[0] + \".wav\")\n            text = cols[2]\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef ljspeech_test(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes the LJSpeech meta data file for TTS testing\n    https://keithito.com/LJ-Speech-Dataset/\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        speaker_id = 0\n        for idx, line in enumerate(ttf):\n            # 2 samples per speaker to avoid eval split issues\n            if idx % 2 == 0:\n                speaker_id += 1\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, \"wavs\", cols[0] + \".wav\")\n            text = cols[2]\n            items.append(\n                {\"text\": text, \"audio_file\": wav_file, \"speaker_name\": f\"ljspeech-{speaker_id}\", \"root_path\": root_path}\n            )\n    return items\n\n\ndef thorsten(root_path, meta_file, 
**kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes the thorsten meta data file to TTS format\n    https://github.com/thorstenMueller/deep-learning-german-tts/\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"thorsten\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, \"wavs\", cols[0] + \".wav\")\n            text = cols[1]\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef sam_accenture(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes the sam-accenture meta data file to TTS format\n    https://github.com/Sam-Accenture-Non-Binary-Voice/non-binary-voice-files\"\"\"\n    xml_file = os.path.join(root_path, \"voice_over_recordings\", meta_file)\n    xml_root = ET.parse(xml_file).getroot()\n    items = []\n    speaker_name = \"sam_accenture\"\n    for item in xml_root.findall(\"./fileid\"):\n        text = item.text\n        wav_file = os.path.join(root_path, \"vo_voice_quality_transformation\", item.get(\"id\") + \".wav\")\n        if not os.path.exists(wav_file):\n            print(f\" [!] {wav_file} in metafile does not exist. 
Skipping...\")\n            continue\n        items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef ruslan(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes the RUSLAN meta data file to TTS format\n    https://ruslan-corpus.github.io/\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"ruslan\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, \"RUSLAN\", cols[0] + \".wav\")\n            text = cols[1]\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef css10(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes the CSS10 dataset file to TTS format\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"css10\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, cols[0])\n            text = cols[1]\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name})\n    return items\n\n\ndef nancy(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Normalizes the Nancy meta data file to TTS format\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"nancy\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            utt_id = line.split()[1]\n            text = line[line.find('\"') + 1 : line.rfind('\"') - 1]\n            wav_file = os.path.join(root_path, \"wavn\", utt_id + \".wav\")\n            items.append({\"text\": text, \"audio_file\": 
wav_file, \"speaker_name\": speaker_name})\n    return items\n\n\ndef common_voice(root_path, meta_file, ignored_speakers=None):\n    \"\"\"Normalize the common voice meta data file to TTS format.\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            if line.startswith(\"client_id\"):\n                continue\n            cols = line.split(\"\\t\")\n            text = cols[2]\n            speaker_name = cols[0]\n            # ignore speakers\n            if isinstance(ignored_speakers, list):\n                if speaker_name in ignored_speakers:\n                    continue\n            wav_file = os.path.join(root_path, \"clips\", cols[1].replace(\".mp3\", \".wav\"))\n            items.append(\n                {\"text\": text, \"audio_file\": wav_file, \"speaker_name\": \"MCV_\" + speaker_name, \"root_path\": root_path}\n            )\n    return items\n\n\ndef libri_tts(root_path, meta_files=None, ignored_speakers=None):\n    \"\"\"https://ai.google/tools/datasets/libri-tts/\"\"\"\n    items = []\n    if not meta_files:\n        meta_files = glob(f\"{root_path}/**/*trans.tsv\", recursive=True)\n    else:\n        if isinstance(meta_files, str):\n            meta_files = [os.path.join(root_path, meta_files)]\n\n    for meta_file in meta_files:\n        _meta_file = os.path.basename(meta_file).split(\".\")[0]\n        with open(meta_file, \"r\", encoding=\"utf-8\") as ttf:\n            for line in ttf:\n                cols = line.split(\"\\t\")\n                file_name = cols[0]\n                speaker_name, chapter_id, *_ = cols[0].split(\"_\")\n                _root_path = os.path.join(root_path, f\"{speaker_name}/{chapter_id}\")\n                wav_file = os.path.join(_root_path, file_name + \".wav\")\n                text = cols[2]\n                # ignore speakers\n                if isinstance(ignored_speakers, list):\n                 
   if speaker_name in ignored_speakers:\n                        continue\n                items.append(\n                    {\n                        \"text\": text,\n                        \"audio_file\": wav_file,\n                        \"speaker_name\": f\"LTTS_{speaker_name}\",\n                        \"root_path\": root_path,\n                    }\n                )\n    for item in items:\n        assert os.path.exists(item[\"audio_file\"]), f\" [!] wav files don't exist - {item['audio_file']}\"\n    return items\n\n\ndef custom_turkish(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"turkish-female\"\n    skipped_files = []\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, \"wavs\", cols[0].strip() + \".wav\")\n            if not os.path.exists(wav_file):\n                skipped_files.append(wav_file)\n                continue\n            text = cols[1].strip()\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    print(f\" [!] {len(skipped_files)} files skipped. 
They don't exist...\")\n    return items\n\n\n# ToDo: add the dataset link when the dataset is released publicly\ndef brspeech(root_path, meta_file, ignored_speakers=None):\n    \"\"\"BRSpeech 3.0 beta\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            if line.startswith(\"wav_filename\"):\n                continue\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, cols[0])\n            text = cols[2]\n            speaker_id = cols[3]\n            # ignore speakers\n            if isinstance(ignored_speakers, list):\n                if speaker_id in ignored_speakers:\n                    continue\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_id, \"root_path\": root_path})\n    return items\n\n\ndef vctk(root_path, meta_files=None, wavs_path=\"wav48_silence_trimmed\", mic=\"mic1\", ignored_speakers=None):\n    \"\"\"VCTK dataset v0.92.\n\n    URL:\n        https://datashare.ed.ac.uk/bitstream/handle/10283/3443/VCTK-Corpus-0.92.zip\n\n    This dataset has 2 recordings per speaker that are annotated with ```mic1``` and ```mic2```.\n    It is believed that (😄 ) ```mic1``` files are the same as the previous version of the dataset.\n\n    mic1:\n        Audio recorded using an omni-directional microphone (DPA 4035).\n        Contains very low frequency noises.\n        This is the same audio released in previous versions of VCTK:\n        https://doi.org/10.7488/ds/1994\n\n    mic2:\n        Audio recorded using a small diaphragm condenser microphone with\n        very wide bandwidth (Sennheiser MKH 800).\n        Two speakers, p280 and p315 had technical issues of the audio\n        recordings using MKH 800.\n    \"\"\"\n    file_ext = \"flac\"\n    items = []\n    meta_files = glob(f\"{os.path.join(root_path,'txt')}/**/*.txt\", recursive=True)\n    for meta_file in 
meta_files:\n        _, speaker_id, txt_file = os.path.relpath(meta_file, root_path).split(os.sep)\n        file_id = txt_file.split(\".\")[0]\n        # ignore speakers\n        if isinstance(ignored_speakers, list):\n            if speaker_id in ignored_speakers:\n                continue\n        with open(meta_file, \"r\", encoding=\"utf-8\") as file_text:\n            text = file_text.readlines()[0]\n        # p280 has no mic2 recordings\n        if speaker_id == \"p280\":\n            wav_file = os.path.join(root_path, wavs_path, speaker_id, file_id + f\"_mic1.{file_ext}\")\n        else:\n            wav_file = os.path.join(root_path, wavs_path, speaker_id, file_id + f\"_{mic}.{file_ext}\")\n        if os.path.exists(wav_file):\n            items.append(\n                {\"text\": text, \"audio_file\": wav_file, \"speaker_name\": \"VCTK_\" + speaker_id, \"root_path\": root_path}\n            )\n        else:\n            print(f\" [!] wav files don't exist - {wav_file}\")\n    return items\n\n\ndef vctk_old(root_path, meta_files=None, wavs_path=\"wav48\", ignored_speakers=None):\n    \"\"\"homepages.inf.ed.ac.uk/jyamagis/release/VCTK-Corpus.tar.gz\"\"\"\n    items = []\n    meta_files = glob(f\"{os.path.join(root_path,'txt')}/**/*.txt\", recursive=True)\n    for meta_file in meta_files:\n        _, speaker_id, txt_file = os.path.relpath(meta_file, root_path).split(os.sep)\n        file_id = txt_file.split(\".\")[0]\n        # ignore speakers\n        if isinstance(ignored_speakers, list):\n            if speaker_id in ignored_speakers:\n                continue\n        with open(meta_file, \"r\", encoding=\"utf-8\") as file_text:\n            text = file_text.readlines()[0]\n        wav_file = os.path.join(root_path, wavs_path, speaker_id, file_id + \".wav\")\n        items.append(\n            {\"text\": text, \"audio_file\": wav_file, \"speaker_name\": \"VCTK_old_\" + speaker_id, \"root_path\": root_path}\n        )\n    return items\n\n\ndef 
synpaflex(root_path, metafiles=None, **kwargs):  # pylint: disable=unused-argument\n    items = []\n    speaker_name = \"synpaflex\"\n    root_path = os.path.join(root_path, \"\")\n    wav_files = glob(f\"{root_path}**/*.wav\", recursive=True)\n    for wav_file in wav_files:\n        if os.sep + \"wav\" + os.sep in wav_file:\n            txt_file = wav_file.replace(\"wav\", \"txt\")\n        else:\n            txt_file = os.path.join(\n                os.path.dirname(wav_file), \"txt\", os.path.basename(wav_file).replace(\".wav\", \".txt\")\n            )\n        if os.path.exists(txt_file) and os.path.exists(wav_file):\n            with open(txt_file, \"r\", encoding=\"utf-8\") as file_text:\n                text = file_text.readlines()[0]\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef open_bible(root_path, meta_files=\"train\", ignore_digits_sentences=True, ignored_speakers=None):\n    \"\"\"ToDo: Refer the paper when available\"\"\"\n    items = []\n    split_dir = meta_files\n    meta_files = glob(f\"{os.path.join(root_path, split_dir)}/**/*.txt\", recursive=True)\n    for meta_file in meta_files:\n        _, speaker_id, txt_file = os.path.relpath(meta_file, root_path).split(os.sep)\n        file_id = txt_file.split(\".\")[0]\n        # ignore speakers\n        if isinstance(ignored_speakers, list):\n            if speaker_id in ignored_speakers:\n                continue\n        with open(meta_file, \"r\", encoding=\"utf-8\") as file_text:\n            text = file_text.readline().replace(\"\\n\", \"\")\n        # ignore sentences that contains digits\n        if ignore_digits_sentences and any(map(str.isdigit, text)):\n            continue\n        wav_file = os.path.join(root_path, split_dir, speaker_id, file_id + \".flac\")\n        items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": \"OB_\" + speaker_id, \"root_path\": 
root_path})\n    return items\n\n\ndef mls(root_path, meta_files=None, ignored_speakers=None):\n    \"\"\"http://www.openslr.org/94/\"\"\"\n    items = []\n    with open(os.path.join(root_path, meta_files), \"r\", encoding=\"utf-8\") as meta:\n        for line in meta:\n            file, text = line.split(\"\\t\")\n            text = text[:-1]\n            speaker, book, *_ = file.split(\"_\")\n            wav_file = os.path.join(root_path, os.path.dirname(meta_files), \"audio\", speaker, book, file + \".wav\")\n            # ignore speakers\n            if isinstance(ignored_speakers, list):\n                if speaker in ignored_speakers:\n                    continue\n            items.append(\n                {\"text\": text, \"audio_file\": wav_file, \"speaker_name\": \"MLS_\" + speaker, \"root_path\": root_path}\n            )\n    return items\n\n\n# ======================================== VOX CELEB ===========================================\ndef voxceleb2(root_path, meta_file=None, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"\n    :param meta_file   Used only for consistency with load_tts_samples api\n    \"\"\"\n    return _voxcel_x(root_path, meta_file, voxcel_idx=\"2\")\n\n\ndef voxceleb1(root_path, meta_file=None, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"\n    :param meta_file   Used only for consistency with load_tts_samples api\n    \"\"\"\n    return _voxcel_x(root_path, meta_file, voxcel_idx=\"1\")\n\n\ndef _voxcel_x(root_path, meta_file, voxcel_idx):\n    assert voxcel_idx in [\"1\", \"2\"]\n    expected_count = 148_000 if voxcel_idx == \"1\" else 1_000_000\n    voxceleb_path = Path(root_path)\n    cache_to = voxceleb_path / f\"metafile_voxceleb{voxcel_idx}.csv\"\n    cache_to.parent.mkdir(exist_ok=True)\n\n    # if not exists meta file, crawl recursively for 'wav' files\n    if meta_file is not None:\n        with open(str(meta_file), \"r\", encoding=\"utf-8\") as f:\n            return [x.strip().split(\"|\") for x 
in f.readlines()]\n\n    elif not cache_to.exists():\n        cnt = 0\n        meta_data = []\n        wav_files = voxceleb_path.rglob(\"**/*.wav\")\n        for path in tqdm(\n            wav_files,\n            desc=f\"Building VoxCeleb {voxcel_idx} Meta file ... this needs to be done only once.\",\n            total=expected_count,\n        ):\n            speaker_id = str(Path(path).parent.parent.stem)\n            assert speaker_id.startswith(\"id\")\n            text = None  # VoxCeleb does not provide transcriptions, and they are not needed for training the SE\n            meta_data.append(f\"{text}|{path}|voxcel{voxcel_idx}_{speaker_id}\\n\")\n            cnt += 1\n        with open(str(cache_to), \"w\", encoding=\"utf-8\") as f:\n            f.write(\"\".join(meta_data))\n        if cnt < expected_count:\n            raise ValueError(f\"Found too few instances for Voxceleb. Should be around {expected_count}, is: {cnt}\")\n\n    with open(str(cache_to), \"r\", encoding=\"utf-8\") as f:\n        return [x.strip().split(\"|\") for x in f.readlines()]\n\n\ndef emotion(root_path, meta_file, ignored_speakers=None):\n    \"\"\"Generic emotion dataset\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            if line.startswith(\"file_path\"):\n                continue\n            cols = line.split(\",\")\n            wav_file = os.path.join(root_path, cols[0])\n            speaker_id = cols[1]\n            emotion_id = cols[2].replace(\"\\n\", \"\")\n            # ignore speakers\n            if isinstance(ignored_speakers, list):\n                if speaker_id in ignored_speakers:\n                    continue\n            items.append(\n                {\"audio_file\": wav_file, \"speaker_name\": speaker_id, \"emotion_name\": emotion_id, \"root_path\": root_path}\n            )\n    return items\n\n\ndef baker(root_path: str, meta_file: str, **kwargs) 
-> List[List[str]]:  # pylint: disable=unused-argument\n    \"\"\"Normalizes the Baker meta data file to TTS format\n\n    Args:\n        root_path (str): path to the baker dataset\n        meta_file (str): name of the meta dataset containing names of wav to select and the transcript of the sentence\n    Returns:\n        List[List[str]]: List of (text, wav_path, speaker_name) associated with each sentences\n    \"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"baker\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            wav_name, text = line.rstrip(\"\\n\").split(\"|\")\n            wav_path = os.path.join(root_path, \"clips_22\", wav_name)\n            items.append({\"text\": text, \"audio_file\": wav_path, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef kokoro(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Japanese single-speaker dataset from https://github.com/kaiidams/Kokoro-Speech-Dataset\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"kokoro\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = line.split(\"|\")\n            wav_file = os.path.join(root_path, \"wavs\", cols[0] + \".wav\")\n            text = cols[2].replace(\" \", \"\")\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n\n\ndef kss(root_path, meta_file, **kwargs):  # pylint: disable=unused-argument\n    \"\"\"Korean single-speaker dataset from https://www.kaggle.com/datasets/bryanpark/korean-single-speaker-speech-dataset\"\"\"\n    txt_file = os.path.join(root_path, meta_file)\n    items = []\n    speaker_name = \"kss\"\n    with open(txt_file, \"r\", encoding=\"utf-8\") as ttf:\n        for line in ttf:\n            cols = 
line.split(\"|\")\n            wav_file = os.path.join(root_path, cols[0])\n            text = cols[2]  # cols[1] => 6월, cols[2] => 유월\n            items.append({\"text\": text, \"audio_file\": wav_file, \"speaker_name\": speaker_name, \"root_path\": root_path})\n    return items\n"
  },
  {
    "path": "TTS/tts/layers/__init__.py",
    "content": "from TTS.tts.layers.losses import *\n"
  },
  {
    "path": "TTS/tts/layers/align_tts/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/layers/align_tts/duration_predictor.py",
    "content": "from torch import nn\n\nfrom TTS.tts.layers.generic.pos_encoding import PositionalEncoding\nfrom TTS.tts.layers.generic.transformer import FFTransformerBlock\n\n\nclass DurationPredictor(nn.Module):\n    def __init__(self, num_chars, hidden_channels, hidden_channels_ffn, num_heads):\n        super().__init__()\n        self.embed = nn.Embedding(num_chars, hidden_channels)\n        self.pos_enc = PositionalEncoding(hidden_channels, dropout_p=0.1)\n        self.FFT = FFTransformerBlock(hidden_channels, num_heads, hidden_channels_ffn, 2, 0.1)\n        self.out_layer = nn.Conv1d(hidden_channels, 1, 1)\n\n    def forward(self, text, text_lengths):\n        # B, L -> B, L\n        emb = self.embed(text)\n        emb = self.pos_enc(emb.transpose(1, 2))\n        x = self.FFT(emb, text_lengths)\n        x = self.out_layer(x).squeeze(-1)\n        return x\n"
  },
  {
    "path": "TTS/tts/layers/align_tts/mdn.py",
    "content": "from torch import nn\n\n\nclass MDNBlock(nn.Module):\n    \"\"\"Mixture of Density Network implementation\n    https://arxiv.org/pdf/2003.01950.pdf\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels):\n        super().__init__()\n        self.out_channels = out_channels\n        self.conv1 = nn.Conv1d(in_channels, in_channels, 1)\n        self.norm = nn.LayerNorm(in_channels)\n        self.relu = nn.ReLU()\n        self.dropout = nn.Dropout(0.1)\n        self.conv2 = nn.Conv1d(in_channels, out_channels, 1)\n\n    def forward(self, x):\n        o = self.conv1(x)\n        o = o.transpose(1, 2)\n        o = self.norm(o)\n        o = o.transpose(1, 2)\n        o = self.relu(o)\n        o = self.dropout(o)\n        mu_sigma = self.conv2(o)\n        # TODO: check this sigmoid\n        # mu = torch.sigmoid(mu_sigma[:, :self.out_channels//2, :])\n        mu = mu_sigma[:, : self.out_channels // 2, :]\n        log_sigma = mu_sigma[:, self.out_channels // 2 :, :]\n        return mu, log_sigma\n"
  },
  {
    "path": "TTS/tts/layers/feed_forward/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/layers/feed_forward/decoder.py",
    "content": "import torch\nfrom torch import nn\n\nfrom TTS.tts.layers.generic.res_conv_bn import Conv1dBN, Conv1dBNBlock, ResidualConv1dBNBlock\nfrom TTS.tts.layers.generic.transformer import FFTransformerBlock\nfrom TTS.tts.layers.generic.wavenet import WNBlocks\nfrom TTS.tts.layers.glow_tts.transformer import RelativePositionTransformer\n\n\nclass WaveNetDecoder(nn.Module):\n    \"\"\"WaveNet based decoder with a prenet and a postnet.\n\n    prenet: conv1d_1x1\n    postnet: 3 x [conv1d_1x1 -> relu] -> conv1d_1x1\n\n    TODO: Integrate speaker conditioning vector.\n\n    Note:\n        default wavenet parameters;\n            params = {\n                \"num_blocks\": 12,\n                \"hidden_channels\":192,\n                \"kernel_size\": 5,\n                \"dilation_rate\": 1,\n                \"num_layers\": 4,\n                \"dropout_p\": 0.05\n            }\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of hidden channels for prenet and postnet.\n        params (dict): dictionary for residual convolutional blocks.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, hidden_channels, c_in_channels, params):\n        super().__init__()\n        # prenet\n        self.prenet = torch.nn.Conv1d(in_channels, params[\"hidden_channels\"], 1)\n        # wavenet layers\n        self.wn = WNBlocks(params[\"hidden_channels\"], c_in_channels=c_in_channels, **params)\n        # postnet\n        self.postnet = [\n            torch.nn.Conv1d(params[\"hidden_channels\"], hidden_channels, 1),\n            torch.nn.ReLU(),\n            torch.nn.Conv1d(hidden_channels, hidden_channels, 1),\n            torch.nn.ReLU(),\n            torch.nn.Conv1d(hidden_channels, hidden_channels, 1),\n            torch.nn.ReLU(),\n            torch.nn.Conv1d(hidden_channels, out_channels, 1),\n        ]\n        self.postnet = 
nn.Sequential(*self.postnet)\n\n    def forward(self, x, x_mask=None, g=None):\n        x = self.prenet(x) * x_mask\n        x = self.wn(x, x_mask, g)\n        o = self.postnet(x) * x_mask\n        return o\n\n\nclass RelativePositionTransformerDecoder(nn.Module):\n    \"\"\"Decoder with Relative Positional Transformer.\n\n    Note:\n        Default params\n            params={\n                'hidden_channels_ffn': 128,\n                'num_heads': 2,\n                \"kernel_size\": 3,\n                \"dropout_p\": 0.1,\n                \"num_layers\": 8,\n                \"rel_attn_window_size\": 4,\n                \"input_length\": None\n            }\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of hidden channels including Transformer layers.\n        params (dict): dictionary for residual convolutional blocks.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, hidden_channels, params):\n        super().__init__()\n        self.prenet = Conv1dBN(in_channels, hidden_channels, 1, 1)\n        self.rel_pos_transformer = RelativePositionTransformer(in_channels, out_channels, hidden_channels, **params)\n\n    def forward(self, x, x_mask=None, g=None):  # pylint: disable=unused-argument\n        o = self.prenet(x) * x_mask\n        o = self.rel_pos_transformer(o, x_mask)\n        return o\n\n\nclass FFTransformerDecoder(nn.Module):\n    \"\"\"Decoder with FeedForwardTransformer.\n\n    Default params\n            params={\n                'hidden_channels_ffn': 1024,\n                'num_heads': 2,\n                \"dropout_p\": 0.1,\n                \"num_layers\": 6,\n            }\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of hidden channels including Transformer layers.\n        params (dict): dictionary for 
residual convolutional blocks.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, params):\n        super().__init__()\n        self.transformer_block = FFTransformerBlock(in_channels, **params)\n        self.postnet = nn.Conv1d(in_channels, out_channels, 1)\n\n    def forward(self, x, x_mask=None, g=None):  # pylint: disable=unused-argument\n        # TODO: handle multi-speaker\n        x_mask = 1 if x_mask is None else x_mask\n        o = self.transformer_block(x) * x_mask\n        o = self.postnet(o) * x_mask\n        return o\n\n\nclass ResidualConv1dBNDecoder(nn.Module):\n    \"\"\"Residual Convolutional Decoder as in the original Speedy Speech paper\n\n    TODO: Integrate speaker conditioning vector.\n\n    Note:\n        Default params\n                params = {\n                    \"kernel_size\": 4,\n                    \"dilations\": 4 * [1, 2, 4, 8] + [1],\n                    \"num_conv_blocks\": 2,\n                    \"num_res_blocks\": 17\n                }\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of hidden channels including ResidualConv1dBNBlock layers.\n        params (dict): dictionary for residual convolutional blocks.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, hidden_channels, params):\n        super().__init__()\n        self.res_conv_block = ResidualConv1dBNBlock(in_channels, hidden_channels, hidden_channels, **params)\n        self.post_conv = nn.Conv1d(hidden_channels, hidden_channels, 1)\n        self.postnet = nn.Sequential(\n            Conv1dBNBlock(\n                hidden_channels, hidden_channels, hidden_channels, params[\"kernel_size\"], 1, num_conv_blocks=2\n            ),\n            nn.Conv1d(hidden_channels, out_channels, 1),\n        )\n\n    def forward(self, x, x_mask=None, g=None):  # pylint: disable=unused-argument\n        o = self.res_conv_block(x, x_mask)\n       
 o = self.post_conv(o) + x\n        return self.postnet(o) * x_mask\n\n\nclass Decoder(nn.Module):\n    \"\"\"Decodes the expanded phoneme encoding into spectrograms\n    Args:\n        out_channels (int): number of output channels.\n        in_hidden_channels (int): input and hidden channels. Model keeps the input channels for the intermediate layers.\n        decoder_type (str): decoder layer types. 'transformers' or 'residual_conv_bn'. Default 'residual_conv_bn'.\n        decoder_params (dict): model parameters for specified decoder type.\n        c_in_channels (int): number of channels for conditional input.\n\n    Shapes:\n        - input: (B, C, T)\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        out_channels,\n        in_hidden_channels,\n        decoder_type=\"residual_conv_bn\",\n        decoder_params={\n            \"kernel_size\": 4,\n            \"dilations\": 4 * [1, 2, 4, 8] + [1],\n            \"num_conv_blocks\": 2,\n            \"num_res_blocks\": 17,\n        },\n        c_in_channels=0,\n    ):\n        super().__init__()\n\n        if decoder_type.lower() == \"relative_position_transformer\":\n            self.decoder = RelativePositionTransformerDecoder(\n                in_channels=in_hidden_channels,\n                out_channels=out_channels,\n                hidden_channels=in_hidden_channels,\n                params=decoder_params,\n            )\n        elif decoder_type.lower() == \"residual_conv_bn\":\n            self.decoder = ResidualConv1dBNDecoder(\n                in_channels=in_hidden_channels,\n                out_channels=out_channels,\n                hidden_channels=in_hidden_channels,\n                params=decoder_params,\n            )\n        elif decoder_type.lower() == \"wavenet\":\n            self.decoder = WaveNetDecoder(\n                in_channels=in_hidden_channels,\n                out_channels=out_channels,\n                
hidden_channels=in_hidden_channels,\n                c_in_channels=c_in_channels,\n                params=decoder_params,\n            )\n        elif decoder_type.lower() == \"fftransformer\":\n            self.decoder = FFTransformerDecoder(in_hidden_channels, out_channels, decoder_params)\n        else:\n            raise ValueError(f\"[!] Unknown decoder type - {decoder_type}\")\n\n    def forward(self, x, x_mask, g=None):  # pylint: disable=unused-argument\n        \"\"\"\n        Args:\n            x: [B, C, T]\n            x_mask: [B, 1, T]\n            g: [B, C_g, 1]\n        \"\"\"\n        # TODO: implement multi-speaker\n        o = self.decoder(x, x_mask, g)\n        return o\n"
  },
  {
    "path": "TTS/tts/layers/feed_forward/duration_predictor.py",
    "content": "from torch import nn\n\nfrom TTS.tts.layers.generic.res_conv_bn import Conv1dBN\n\n\nclass DurationPredictor(nn.Module):\n    \"\"\"Speedy Speech duration predictor model.\n    Predicts phoneme durations from encoder outputs.\n\n    Note:\n        Outputs interpreted as log(durations)\n        To get actual durations, do exp transformation\n\n    conv_BN_4x1 -> conv_BN_3x1 -> conv_BN_1x1 -> conv_1x1\n\n    Args:\n        hidden_channels (int): number of channels in the inner layers.\n    \"\"\"\n\n    def __init__(self, hidden_channels):\n        super().__init__()\n\n        self.layers = nn.ModuleList(\n            [\n                Conv1dBN(hidden_channels, hidden_channels, 4, 1),\n                Conv1dBN(hidden_channels, hidden_channels, 3, 1),\n                Conv1dBN(hidden_channels, hidden_channels, 1, 1),\n                nn.Conv1d(hidden_channels, 1, 1),\n            ]\n        )\n\n    def forward(self, x, x_mask):\n        \"\"\"\n        Shapes:\n            x: [B, C, T]\n            x_mask: [B, 1, T]\n        \"\"\"\n        o = x\n        for layer in self.layers:\n            o = layer(o) * x_mask\n        return o\n"
  },
  {
    "path": "TTS/tts/layers/feed_forward/encoder.py",
    "content": "from torch import nn\n\nfrom TTS.tts.layers.generic.res_conv_bn import ResidualConv1dBNBlock\nfrom TTS.tts.layers.generic.transformer import FFTransformerBlock\nfrom TTS.tts.layers.glow_tts.transformer import RelativePositionTransformer\n\n\nclass RelativePositionTransformerEncoder(nn.Module):\n    \"\"\"Speedy speech encoder built on Transformer with Relative Position encoding.\n\n    TODO: Integrate speaker conditioning vector.\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of hidden channels\n        params (dict): dictionary for residual convolutional blocks.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, hidden_channels, params):\n        super().__init__()\n        self.prenet = ResidualConv1dBNBlock(\n            in_channels,\n            hidden_channels,\n            hidden_channels,\n            kernel_size=5,\n            num_res_blocks=3,\n            num_conv_blocks=1,\n            dilations=[1, 1, 1],\n        )\n        self.rel_pos_transformer = RelativePositionTransformer(hidden_channels, out_channels, hidden_channels, **params)\n\n    def forward(self, x, x_mask=None, g=None):  # pylint: disable=unused-argument\n        if x_mask is None:\n            x_mask = 1\n        o = self.prenet(x) * x_mask\n        o = self.rel_pos_transformer(o, x_mask)\n        return o\n\n\nclass ResidualConv1dBNEncoder(nn.Module):\n    \"\"\"Residual Convolutional Encoder as in the original Speedy Speech paper\n\n    TODO: Integrate speaker conditioning vector.\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of hidden channels\n        params (dict): dictionary for residual convolutional blocks.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, hidden_channels, params):\n        
super().__init__()\n        self.prenet = nn.Sequential(nn.Conv1d(in_channels, hidden_channels, 1), nn.ReLU())\n        self.res_conv_block = ResidualConv1dBNBlock(hidden_channels, hidden_channels, hidden_channels, **params)\n\n        self.postnet = nn.Sequential(\n            *[\n                nn.Conv1d(hidden_channels, hidden_channels, 1),\n                nn.ReLU(),\n                nn.BatchNorm1d(hidden_channels),\n                nn.Conv1d(hidden_channels, out_channels, 1),\n            ]\n        )\n\n    def forward(self, x, x_mask=None, g=None):  # pylint: disable=unused-argument\n        if x_mask is None:\n            x_mask = 1\n        o = self.prenet(x) * x_mask\n        o = self.res_conv_block(o, x_mask)\n        o = self.postnet(o + x) * x_mask\n        return o * x_mask\n\n\nclass Encoder(nn.Module):\n    # pylint: disable=dangerous-default-value\n    \"\"\"Factory class for Speedy Speech encoder enables different encoder types internally.\n\n    Args:\n        num_chars (int): number of characters.\n        out_channels (int): number of output channels.\n        in_hidden_channels (int): input and hidden channels. Model keeps the input channels for the intermediate layers.\n        encoder_type (str): encoder layer types. 'transformers' or 'residual_conv_bn'. 
Default 'residual_conv_bn'.\n        encoder_params (dict): model parameters for specified encoder type.\n        c_in_channels (int): number of channels for conditional input.\n\n    Note:\n        Default encoder_params to be set in config.json...\n\n        ```python\n        # for 'relative_position_transformer'\n        encoder_params={\n            'hidden_channels_ffn': 128,\n            'num_heads': 2,\n            \"kernel_size\": 3,\n            \"dropout_p\": 0.1,\n            \"num_layers\": 6,\n            \"rel_attn_window_size\": 4,\n            \"input_length\": None\n        },\n\n        # for 'residual_conv_bn'\n        encoder_params = {\n            \"kernel_size\": 4,\n            \"dilations\": 4 * [1, 2, 4] + [1],\n            \"num_conv_blocks\": 2,\n            \"num_res_blocks\": 13\n        }\n\n        # for 'fftransformer'\n        encoder_params = {\n            \"hidden_channels_ffn\": 1024 ,\n            \"num_heads\": 2,\n            \"num_layers\": 6,\n            \"dropout_p\": 0.1\n        }\n        ```\n    \"\"\"\n\n    def __init__(\n        self,\n        in_hidden_channels,\n        out_channels,\n        encoder_type=\"residual_conv_bn\",\n        encoder_params={\"kernel_size\": 4, \"dilations\": 4 * [1, 2, 4] + [1], \"num_conv_blocks\": 2, \"num_res_blocks\": 13},\n        c_in_channels=0,\n    ):\n        super().__init__()\n        self.out_channels = out_channels\n        self.in_channels = in_hidden_channels\n        self.hidden_channels = in_hidden_channels\n        self.encoder_type = encoder_type\n        self.c_in_channels = c_in_channels\n\n        # init encoder\n        if encoder_type.lower() == \"relative_position_transformer\":\n            # text encoder\n            # pylint: disable=unexpected-keyword-arg\n            self.encoder = RelativePositionTransformerEncoder(\n                in_hidden_channels, out_channels, in_hidden_channels, encoder_params\n            )\n        elif encoder_type.lower() 
== \"residual_conv_bn\":\n            self.encoder = ResidualConv1dBNEncoder(in_hidden_channels, out_channels, in_hidden_channels, encoder_params)\n        elif encoder_type.lower() == \"fftransformer\":\n            assert (\n                in_hidden_channels == out_channels\n            ), \"[!] must be `in_channels` == `out_channels` when encoder type is 'fftransformer'\"\n            # pylint: disable=unexpected-keyword-arg\n            self.encoder = FFTransformerBlock(in_hidden_channels, **encoder_params)\n        else:\n            raise NotImplementedError(\" [!] unknown encoder type.\")\n\n    def forward(self, x, x_mask, g=None):  # pylint: disable=unused-argument\n        \"\"\"\n        Shapes:\n            x: [B, C, T]\n            x_mask: [B, 1, T]\n            g: [B, C, 1]\n        \"\"\"\n        o = self.encoder(x, x_mask)\n        return o * x_mask\n"
  },
  {
    "path": "TTS/tts/layers/generic/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/layers/generic/aligner.py",
    "content": "from typing import Tuple\n\nimport torch\nfrom torch import nn\n\n\nclass AlignmentNetwork(torch.nn.Module):\n    \"\"\"Aligner Network for learning alignment between the input text and the model output with Gaussian Attention.\n\n    ::\n\n        query -> conv1d -> relu -> conv1d -> relu -> conv1d -> L2_dist -> softmax -> alignment\n        key   -> conv1d -> relu -> conv1d -----------------------^\n\n    Args:\n        in_query_channels (int): Number of channels in the query network. Defaults to 80.\n        in_key_channels (int): Number of channels in the key network. Defaults to 512.\n        attn_channels (int): Number of inner channels in the attention layers. Defaults to 80.\n        temperature (float): Temperature for the softmax. Defaults to 0.0005.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_query_channels=80,\n        in_key_channels=512,\n        attn_channels=80,\n        temperature=0.0005,\n    ):\n        super().__init__()\n        self.temperature = temperature\n        self.softmax = torch.nn.Softmax(dim=3)\n        self.log_softmax = torch.nn.LogSoftmax(dim=3)\n\n        self.key_layer = nn.Sequential(\n            nn.Conv1d(\n                in_key_channels,\n                in_key_channels * 2,\n                kernel_size=3,\n                padding=1,\n                bias=True,\n            ),\n            torch.nn.ReLU(),\n            nn.Conv1d(in_key_channels * 2, attn_channels, kernel_size=1, padding=0, bias=True),\n        )\n\n        self.query_layer = nn.Sequential(\n            nn.Conv1d(\n                in_query_channels,\n                in_query_channels * 2,\n                kernel_size=3,\n                padding=1,\n                bias=True,\n            ),\n            torch.nn.ReLU(),\n            nn.Conv1d(in_query_channels * 2, in_query_channels, kernel_size=1, padding=0, bias=True),\n            torch.nn.ReLU(),\n            nn.Conv1d(in_query_channels, attn_channels, kernel_size=1, 
padding=0, bias=True),\n        )\n\n    def forward(\n        self, queries: torch.Tensor, keys: torch.Tensor, mask: torch.Tensor = None, attn_prior: torch.Tensor = None\n    ) -> Tuple[torch.Tensor, torch.Tensor]:\n        \"\"\"Forward pass of the aligner encoder.\n        Shapes:\n            - queries: :math:`[B, C, T_de]`\n            - keys: :math:`[B, C_emb, T_en]`\n            - mask: :math:`[B, T_de]`\n        Output:\n            attn (torch.Tensor): :math:`[B, 1, T_de, T_en]` soft attention mask.\n            attn_logp (torch.Tensor): :math:`[B, 1, T_de, T_en]` log probabilities.\n        \"\"\"\n        key_out = self.key_layer(keys)\n        query_out = self.query_layer(queries)\n        attn_factor = (query_out[:, :, :, None] - key_out[:, :, None]) ** 2\n        attn_logp = -self.temperature * attn_factor.sum(1, keepdim=True)\n        if attn_prior is not None:\n            attn_logp = self.log_softmax(attn_logp) + torch.log(attn_prior[:, None] + 1e-8)\n        if mask is not None:\n            attn_logp.data.masked_fill_(~mask.bool().unsqueeze(2), -float(\"inf\"))\n        attn = self.softmax(attn_logp)\n        return attn, attn_logp\n"
  },
  {
    "path": "TTS/tts/layers/generic/gated_conv.py",
    "content": "from torch import nn\n\nfrom .normalization import LayerNorm\n\n\nclass GatedConvBlock(nn.Module):\n    \"\"\"Gated convolutional block as in https://arxiv.org/pdf/1612.08083.pdf\n    Args:\n        in_out_channels (int): number of input/output channels.\n        kernel_size (int): convolution kernel size.\n        dropout_p (float): dropout rate.\n    \"\"\"\n\n    def __init__(self, in_out_channels, kernel_size, dropout_p, num_layers):\n        super().__init__()\n        # class arguments\n        self.dropout_p = dropout_p\n        self.num_layers = num_layers\n        # define layers\n        self.conv_layers = nn.ModuleList()\n        self.norm_layers = nn.ModuleList()\n        self.layers = nn.ModuleList()\n        for _ in range(num_layers):\n            self.conv_layers += [nn.Conv1d(in_out_channels, 2 * in_out_channels, kernel_size, padding=kernel_size // 2)]\n            self.norm_layers += [LayerNorm(2 * in_out_channels)]\n\n    def forward(self, x, x_mask):\n        o = x\n        res = x\n        for idx in range(self.num_layers):\n            o = nn.functional.dropout(o, p=self.dropout_p, training=self.training)\n            o = self.conv_layers[idx](o * x_mask)\n            o = self.norm_layers[idx](o)\n            o = nn.functional.glu(o, dim=1)\n            o = res + o\n            res = o\n        return o\n"
  },
  {
    "path": "TTS/tts/layers/generic/normalization.py",
    "content": "import torch\nfrom torch import nn\n\n\nclass LayerNorm(nn.Module):\n    def __init__(self, channels, eps=1e-4):\n        \"\"\"Layer norm for the 2nd dimension of the input.\n        Args:\n            channels (int): number of channels (2nd dimension) of the input.\n            eps (float): to prevent 0 division\n\n        Shapes:\n            - input: (B, C, T)\n            - output: (B, C, T)\n        \"\"\"\n        super().__init__()\n        self.channels = channels\n        self.eps = eps\n\n        self.gamma = nn.Parameter(torch.ones(1, channels, 1) * 0.1)\n        self.beta = nn.Parameter(torch.zeros(1, channels, 1))\n\n    def forward(self, x):\n        mean = torch.mean(x, 1, keepdim=True)\n        variance = torch.mean((x - mean) ** 2, 1, keepdim=True)\n        x = (x - mean) * torch.rsqrt(variance + self.eps)\n        x = x * self.gamma + self.beta\n        return x\n\n\nclass LayerNorm2(nn.Module):\n    \"\"\"Layer norm for the 2nd dimension of the input using torch primitive.\n    Args:\n        channels (int): number of channels (2nd dimension) of the input.\n        eps (float): to prevent 0 division\n\n    Shapes:\n        - input: (B, C, T)\n        - output: (B, C, T)\n    \"\"\"\n\n    def __init__(self, channels, eps=1e-5):\n        super().__init__()\n        self.channels = channels\n        self.eps = eps\n\n        self.gamma = nn.Parameter(torch.ones(channels))\n        self.beta = nn.Parameter(torch.zeros(channels))\n\n    def forward(self, x):\n        x = x.transpose(1, -1)\n        x = torch.nn.functional.layer_norm(x, (self.channels,), self.gamma, self.beta, self.eps)\n        return x.transpose(1, -1)\n\n\nclass TemporalBatchNorm1d(nn.BatchNorm1d):\n    \"\"\"Normalize each channel separately over time and batch.\"\"\"\n\n    def __init__(self, channels, affine=True, track_running_stats=True, momentum=0.1):\n        super().__init__(channels, affine=affine, track_running_stats=track_running_stats, 
momentum=momentum)\n\n    def forward(self, x):\n        return super().forward(x.transpose(2, 1)).transpose(2, 1)\n\n\nclass ActNorm(nn.Module):\n    \"\"\"Activation Normalization bijector as an alternative to Batch Norm. It computes\n    the mean and std from a sample of data in advance and uses these values\n    for normalization during training.\n\n    Args:\n        channels (int): input channels.\n        ddi (bool): data-dependent initialization flag. Defaults to False.\n\n    Shapes:\n        - inputs: (B, C, T)\n        - outputs: (B, C, T)\n    \"\"\"\n\n    def __init__(self, channels, ddi=False, **kwargs):  # pylint: disable=unused-argument\n        super().__init__()\n        self.channels = channels\n        self.initialized = not ddi\n\n        self.logs = nn.Parameter(torch.zeros(1, channels, 1))\n        self.bias = nn.Parameter(torch.zeros(1, channels, 1))\n\n    def forward(self, x, x_mask=None, reverse=False, **kwargs):  # pylint: disable=unused-argument\n        if x_mask is None:\n            x_mask = torch.ones(x.size(0), 1, x.size(2)).to(device=x.device, dtype=x.dtype)\n        x_len = torch.sum(x_mask, [1, 2])\n        if not self.initialized:\n            self.initialize(x, x_mask)\n            self.initialized = True\n\n        if reverse:\n            z = (x - self.bias) * torch.exp(-self.logs) * x_mask\n            logdet = None\n        else:\n            z = (self.bias + torch.exp(self.logs) * x) * x_mask\n            logdet = torch.sum(self.logs) * x_len  # [b]\n\n        return z, logdet\n\n    def store_inverse(self):\n        pass\n\n    def set_ddi(self, ddi):\n        self.initialized = not ddi\n\n    def initialize(self, x, x_mask):\n        with torch.no_grad():\n            denom = torch.sum(x_mask, [0, 2])\n            m = torch.sum(x * x_mask, [0, 2]) / denom\n            m_sq = torch.sum(x * x * x_mask, [0, 2]) / denom\n            v = m_sq - (m**2)\n            logs = 0.5 * torch.log(torch.clamp_min(v, 1e-6))\n\n            bias_init = (-m * 
torch.exp(-logs)).view(*self.bias.shape).to(dtype=self.bias.dtype)\n            logs_init = (-logs).view(*self.logs.shape).to(dtype=self.logs.dtype)\n\n            self.bias.data.copy_(bias_init)\n            self.logs.data.copy_(logs_init)\n"
  },
  {
    "path": "TTS/tts/layers/generic/pos_encoding.py",
    "content": "import math\n\nimport torch\nfrom torch import nn\n\n\nclass PositionalEncoding(nn.Module):\n    \"\"\"Sinusoidal positional encoding for non-recurrent neural networks.\n    Implementation based on \"Attention Is All You Need\"\n\n    Args:\n       channels (int): embedding size\n       dropout_p (float): dropout rate applied to the output.\n       max_len (int): maximum sequence length.\n       use_scale (bool): whether to use a learnable scaling coefficient.\n    \"\"\"\n\n    def __init__(self, channels, dropout_p=0.0, max_len=5000, use_scale=False):\n        super().__init__()\n        if channels % 2 != 0:\n            raise ValueError(\n                \"Cannot use sin/cos positional encoding with \" \"odd channels (got channels={:d})\".format(channels)\n            )\n        self.use_scale = use_scale\n        if use_scale:\n            self.scale = torch.nn.Parameter(torch.ones(1))\n        pe = torch.zeros(max_len, channels)\n        position = torch.arange(0, max_len).unsqueeze(1)\n        div_term = torch.pow(10000, torch.arange(0, channels, 2).float() / channels)\n        pe[:, 0::2] = torch.sin(position.float() * div_term)\n        pe[:, 1::2] = torch.cos(position.float() * div_term)\n        pe = pe.unsqueeze(0).transpose(1, 2)\n        self.register_buffer(\"pe\", pe)\n        if dropout_p > 0:\n            self.dropout = nn.Dropout(p=dropout_p)\n        self.channels = channels\n\n    def forward(self, x, mask=None, first_idx=None, last_idx=None):\n        \"\"\"\n        Shapes:\n            x: [B, C, T]\n            mask: [B, 1, T]\n            first_idx: int\n            last_idx: int\n        \"\"\"\n\n        x = x * math.sqrt(self.channels)\n        if first_idx is None:\n            if self.pe.size(2) < x.size(2):\n                raise RuntimeError(\n                    f\"Sequence is {x.size(2)} but PositionalEncoding is\"\n                    f\" limited to {self.pe.size(2)}. 
See max_len argument.\"\n                )\n            if mask is not None:\n                pos_enc = self.pe[:, :, : x.size(2)] * mask\n            else:\n                pos_enc = self.pe[:, :, : x.size(2)]\n            if self.use_scale:\n                x = x + self.scale * pos_enc\n            else:\n                x = x + pos_enc\n        else:\n            if self.use_scale:\n                x = x + self.scale * self.pe[:, :, first_idx:last_idx]\n            else:\n                x = x + self.pe[:, :, first_idx:last_idx]\n        if hasattr(self, \"dropout\"):\n            x = self.dropout(x)\n        return x\n"
  },
  {
    "path": "TTS/tts/layers/generic/res_conv_bn.py",
    "content": "from torch import nn\n\n\nclass ZeroTemporalPad(nn.Module):\n    \"\"\"Pad sequences to equal length in the temporal dimension\"\"\"\n\n    def __init__(self, kernel_size, dilation):\n        super().__init__()\n        total_pad = dilation * (kernel_size - 1)\n        begin = total_pad // 2\n        end = total_pad - begin\n        self.pad_layer = nn.ZeroPad2d((0, 0, begin, end))\n\n    def forward(self, x):\n        return self.pad_layer(x)\n\n\nclass Conv1dBN(nn.Module):\n    \"\"\"1D convolution with batch norm.\n    conv1d -> relu -> BN blocks.\n\n    Note:\n        Batch normalization is applied after ReLU, following the original implementation.\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        kernel_size (int): kernel size for convolutional filters.\n        dilation (int): dilation for convolution layers.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, kernel_size, dilation):\n        super().__init__()\n        padding = dilation * (kernel_size - 1)\n        pad_s = padding // 2\n        pad_e = padding - pad_s\n        self.conv1d = nn.Conv1d(in_channels, out_channels, kernel_size, dilation=dilation)\n        self.pad = nn.ZeroPad2d((pad_s, pad_e, 0, 0))  # uneven left and right padding\n        self.norm = nn.BatchNorm1d(out_channels)\n\n    def forward(self, x):\n        o = self.conv1d(x)\n        o = self.pad(o)\n        o = nn.functional.relu(o)\n        o = self.norm(o)\n        return o\n\n\nclass Conv1dBNBlock(nn.Module):\n    \"\"\"1d convolutional block with batch norm. 
It is a set of conv1d -> relu -> BN blocks.\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of inner convolution channels.\n        kernel_size (int): kernel size for convolutional filters.\n        dilation (int): dilation for convolution layers.\n        num_conv_blocks (int, optional): number of convolutional blocks. Defaults to 2.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, hidden_channels, kernel_size, dilation, num_conv_blocks=2):\n        super().__init__()\n        self.conv_bn_blocks = []\n        for idx in range(num_conv_blocks):\n            layer = Conv1dBN(\n                in_channels if idx == 0 else hidden_channels,\n                out_channels if idx == (num_conv_blocks - 1) else hidden_channels,\n                kernel_size,\n                dilation,\n            )\n            self.conv_bn_blocks.append(layer)\n        self.conv_bn_blocks = nn.Sequential(*self.conv_bn_blocks)\n\n    def forward(self, x):\n        \"\"\"\n        Shapes:\n            x: (B, D, T)\n        \"\"\"\n        return self.conv_bn_blocks(x)\n\n\nclass ResidualConv1dBNBlock(nn.Module):\n    \"\"\"Residual Convolutional Blocks with BN\n    Each block has 'num_conv_blocks' conv layers and 'num_res_blocks' such blocks are connected\n    with residual connections.\n\n    conv_block = (conv1d -> relu -> bn) x 'num_conv_blocks'\n    residual_conv_block =  (x -> conv_block ->  + ->) x 'num_res_blocks'\n                            ' - - - - - - - - - ^\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        hidden_channels (int): number of inner convolution channels.\n        kernel_size (int): kernel size for convolutional filters.\n        dilations (list): dilations for each convolution layer.\n        num_res_blocks (int, optional): number of residual blocks. 
Defaults to 13.\n        num_conv_blocks (int, optional): number of convolutional blocks in each residual block. Defaults to 2.\n    \"\"\"\n\n    def __init__(\n        self, in_channels, out_channels, hidden_channels, kernel_size, dilations, num_res_blocks=13, num_conv_blocks=2\n    ):\n        super().__init__()\n        assert len(dilations) == num_res_blocks\n        self.res_blocks = nn.ModuleList()\n        for idx, dilation in enumerate(dilations):\n            block = Conv1dBNBlock(\n                in_channels if idx == 0 else hidden_channels,\n                out_channels if (idx + 1) == len(dilations) else hidden_channels,\n                hidden_channels,\n                kernel_size,\n                dilation,\n                num_conv_blocks,\n            )\n            self.res_blocks.append(block)\n\n    def forward(self, x, x_mask=None):\n        if x_mask is None:\n            x_mask = 1.0\n        o = x * x_mask\n        for block in self.res_blocks:\n            res = o\n            o = block(o)\n            o = o + res\n            if x_mask is not None:\n                o = o * x_mask\n        return o\n"
  },
  {
    "path": "TTS/tts/layers/generic/time_depth_sep_conv.py",
    "content": "import torch\nfrom torch import nn\n\n\nclass TimeDepthSeparableConv(nn.Module):\n    \"\"\"Time depth separable convolution as in https://arxiv.org/pdf/1904.02619.pdf\n    It shows competitive results with less computation and a smaller memory footprint.\"\"\"\n\n    def __init__(self, in_channels, hid_channels, out_channels, kernel_size, bias=True):\n        super().__init__()\n\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.hid_channels = hid_channels\n        self.kernel_size = kernel_size\n\n        self.time_conv = nn.Conv1d(\n            in_channels,\n            2 * hid_channels,\n            kernel_size=1,\n            stride=1,\n            padding=0,\n            bias=bias,\n        )\n        self.norm1 = nn.BatchNorm1d(2 * hid_channels)\n        self.depth_conv = nn.Conv1d(\n            hid_channels,\n            hid_channels,\n            kernel_size,\n            stride=1,\n            padding=(kernel_size - 1) // 2,\n            groups=hid_channels,\n            bias=bias,\n        )\n        self.norm2 = nn.BatchNorm1d(hid_channels)\n        self.time_conv2 = nn.Conv1d(\n            hid_channels,\n            out_channels,\n            kernel_size=1,\n            stride=1,\n            padding=0,\n            bias=bias,\n        )\n        self.norm3 = nn.BatchNorm1d(out_channels)\n\n    def forward(self, x):\n        x_res = x\n        x = self.time_conv(x)\n        x = self.norm1(x)\n        x = nn.functional.glu(x, dim=1)\n        x = self.depth_conv(x)\n        x = self.norm2(x)\n        x = x * torch.sigmoid(x)\n        x = self.time_conv2(x)\n        x = self.norm3(x)\n        x = x_res + x\n        return x\n\n\nclass TimeDepthSeparableConvBlock(nn.Module):\n    def __init__(self, in_channels, hid_channels, out_channels, num_layers, kernel_size, bias=True):\n        super().__init__()\n        assert (kernel_size - 1) % 2 == 0\n        assert num_layers > 1\n\n        self.layers = 
nn.ModuleList()\n        layer = TimeDepthSeparableConv(\n            in_channels, hid_channels, out_channels if num_layers == 1 else hid_channels, kernel_size, bias\n        )\n        self.layers.append(layer)\n        for idx in range(num_layers - 1):\n            layer = TimeDepthSeparableConv(\n                hid_channels,\n                hid_channels,\n                out_channels if (idx + 1) == (num_layers - 1) else hid_channels,\n                kernel_size,\n                bias,\n            )\n            self.layers.append(layer)\n\n    def forward(self, x, mask):\n        for layer in self.layers:\n            x = layer(x * mask)\n        return x\n"
  },
  {
    "path": "TTS/tts/layers/generic/transformer.py",
    "content": "import torch\nimport torch.nn.functional as F\nfrom torch import nn\n\n\nclass FFTransformer(nn.Module):\n    def __init__(self, in_out_channels, num_heads, hidden_channels_ffn=1024, kernel_size_fft=3, dropout_p=0.1):\n        super().__init__()\n        self.self_attn = nn.MultiheadAttention(in_out_channels, num_heads, dropout=dropout_p)\n\n        padding = (kernel_size_fft - 1) // 2\n        self.conv1 = nn.Conv1d(in_out_channels, hidden_channels_ffn, kernel_size=kernel_size_fft, padding=padding)\n        self.conv2 = nn.Conv1d(hidden_channels_ffn, in_out_channels, kernel_size=kernel_size_fft, padding=padding)\n\n        self.norm1 = nn.LayerNorm(in_out_channels)\n        self.norm2 = nn.LayerNorm(in_out_channels)\n\n        self.dropout1 = nn.Dropout(dropout_p)\n        self.dropout2 = nn.Dropout(dropout_p)\n\n    def forward(self, src, src_mask=None, src_key_padding_mask=None):\n        \"\"\"😦 ugly looking with all the transposing\"\"\"\n        src = src.permute(2, 0, 1)\n        src2, enc_align = self.self_attn(src, src, src, attn_mask=src_mask, key_padding_mask=src_key_padding_mask)\n        src = src + self.dropout1(src2)\n        src = self.norm1(src + src2)\n        # T x B x D -> B x D x T\n        src = src.permute(1, 2, 0)\n        src2 = self.conv2(F.relu(self.conv1(src)))\n        src2 = self.dropout2(src2)\n        src = src + src2\n        src = src.transpose(1, 2)\n        src = self.norm2(src)\n        src = src.transpose(1, 2)\n        return src, enc_align\n\n\nclass FFTransformerBlock(nn.Module):\n    def __init__(self, in_out_channels, num_heads, hidden_channels_ffn, num_layers, dropout_p):\n        super().__init__()\n        self.fft_layers = nn.ModuleList(\n            [\n                FFTransformer(\n                    in_out_channels=in_out_channels,\n                    num_heads=num_heads,\n                    hidden_channels_ffn=hidden_channels_ffn,\n                    dropout_p=dropout_p,\n                )\n    
            for _ in range(num_layers)\n            ]\n        )\n\n    def forward(self, x, mask=None, g=None):  # pylint: disable=unused-argument\n        \"\"\"\n        TODO: handle multi-speaker\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - mask:  :math:`[B, 1, T] or [B, T]`\n        \"\"\"\n        if mask is not None and mask.ndim == 3:\n            mask = mask.squeeze(1)\n            # negate the mask: torch's key_padding_mask marks padded positions with True.\n            mask = ~mask.bool()\n        alignments = []\n        for layer in self.fft_layers:\n            x, align = layer(x, src_key_padding_mask=mask)\n            alignments.append(align.unsqueeze(1))\n        alignments = torch.cat(alignments, 1)\n        return x\n\n\nclass FFTDurationPredictor(nn.Module):\n    def __init__(\n        self, in_channels, hidden_channels, num_heads, num_layers, dropout_p=0.1, cond_channels=None\n    ):  # pylint: disable=unused-argument\n        super().__init__()\n        self.fft = FFTransformerBlock(in_channels, num_heads, hidden_channels, num_layers, dropout_p)\n        self.proj = nn.Linear(in_channels, 1)\n\n    def forward(self, x, mask=None, g=None):  # pylint: disable=unused-argument\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - mask:  :math:`[B, 1, T]`\n\n        TODO: Handle the cond input\n        \"\"\"\n        x = self.fft(x, mask=mask)\n        # nn.Linear acts on the last dimension, so project with channels last: [B, C, T] -> [B, 1, T]\n        x = self.proj(x.transpose(1, 2)).transpose(1, 2)\n        return x\n"
  },
  {
    "path": "TTS/tts/layers/generic/wavenet.py",
    "content": "import torch\nfrom torch import nn\n\n\n@torch.jit.script\ndef fused_add_tanh_sigmoid_multiply(input_a, input_b, n_channels):\n    n_channels_int = n_channels[0]\n    in_act = input_a + input_b\n    t_act = torch.tanh(in_act[:, :n_channels_int, :])\n    s_act = torch.sigmoid(in_act[:, n_channels_int:, :])\n    acts = t_act * s_act\n    return acts\n\n\nclass WN(torch.nn.Module):\n    \"\"\"Wavenet layers with weight norm and no input conditioning.\n\n         |-----------------------------------------------------------------------------|\n         |                                    |-> tanh    -|                           |\n    res -|- conv1d(dilation) -> dropout -> + -|            * -> conv1d1x1 -> split -|- + -> res\n    g -------------------------------------|  |-> sigmoid -|                        |\n    o --------------------------------------------------------------------------- + --------- o\n\n    Args:\n        in_channels (int): number of input channels.\n        hidden_channes (int): number of hidden channels.\n        kernel_size (int): filter kernel size for the first conv layer.\n        dilation_rate (int): dilations rate to increase dilation per layer.\n            If it is 2, dilations are 1, 2, 4, 8 for the next 4 layers.\n        num_layers (int): number of wavenet layers.\n        c_in_channels (int): number of channels of conditioning input.\n        dropout_p (float): dropout rate.\n        weight_norm (bool): enable/disable weight norm for convolution layers.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels,\n        hidden_channels,\n        kernel_size,\n        dilation_rate,\n        num_layers,\n        c_in_channels=0,\n        dropout_p=0,\n        weight_norm=True,\n    ):\n        super().__init__()\n        assert kernel_size % 2 == 1\n        assert hidden_channels % 2 == 0\n        self.in_channels = in_channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = 
kernel_size\n        self.dilation_rate = dilation_rate\n        self.num_layers = num_layers\n        self.c_in_channels = c_in_channels\n        self.dropout_p = dropout_p\n\n        self.in_layers = torch.nn.ModuleList()\n        self.res_skip_layers = torch.nn.ModuleList()\n        self.dropout = nn.Dropout(dropout_p)\n\n        # init conditioning layer\n        if c_in_channels > 0:\n            cond_layer = torch.nn.Conv1d(c_in_channels, 2 * hidden_channels * num_layers, 1)\n            self.cond_layer = torch.nn.utils.weight_norm(cond_layer, name=\"weight\")\n        # intermediate layers\n        for i in range(num_layers):\n            dilation = dilation_rate**i\n            padding = int((kernel_size * dilation - dilation) / 2)\n            if i == 0:\n                in_layer = torch.nn.Conv1d(\n                    in_channels, 2 * hidden_channels, kernel_size, dilation=dilation, padding=padding\n                )\n            else:\n                in_layer = torch.nn.Conv1d(\n                    hidden_channels, 2 * hidden_channels, kernel_size, dilation=dilation, padding=padding\n                )\n            in_layer = torch.nn.utils.weight_norm(in_layer, name=\"weight\")\n            self.in_layers.append(in_layer)\n\n            if i < num_layers - 1:\n                res_skip_channels = 2 * hidden_channels\n            else:\n                res_skip_channels = hidden_channels\n\n            res_skip_layer = torch.nn.Conv1d(hidden_channels, res_skip_channels, 1)\n            res_skip_layer = torch.nn.utils.weight_norm(res_skip_layer, name=\"weight\")\n            self.res_skip_layers.append(res_skip_layer)\n        # setup weight norm\n        if not weight_norm:\n            self.remove_weight_norm()\n\n    def forward(self, x, x_mask=None, g=None, **kwargs):  # pylint: disable=unused-argument\n        output = torch.zeros_like(x)\n        n_channels_tensor = torch.IntTensor([self.hidden_channels])\n        x_mask = 1.0 if x_mask is None else 
x_mask\n        if g is not None:\n            g = self.cond_layer(g)\n        for i in range(self.num_layers):\n            x_in = self.in_layers[i](x)\n            x_in = self.dropout(x_in)\n            if g is not None:\n                cond_offset = i * 2 * self.hidden_channels\n                g_l = g[:, cond_offset : cond_offset + 2 * self.hidden_channels, :]\n            else:\n                g_l = torch.zeros_like(x_in)\n            acts = fused_add_tanh_sigmoid_multiply(x_in, g_l, n_channels_tensor)\n            res_skip_acts = self.res_skip_layers[i](acts)\n            if i < self.num_layers - 1:\n                x = (x + res_skip_acts[:, : self.hidden_channels, :]) * x_mask\n                output = output + res_skip_acts[:, self.hidden_channels :, :]\n            else:\n                output = output + res_skip_acts\n        return output * x_mask\n\n    def remove_weight_norm(self):\n        if self.c_in_channels != 0:\n            torch.nn.utils.remove_weight_norm(self.cond_layer)\n        for l in self.in_layers:\n            torch.nn.utils.remove_weight_norm(l)\n        for l in self.res_skip_layers:\n            torch.nn.utils.remove_weight_norm(l)\n\n\nclass WNBlocks(nn.Module):\n    \"\"\"Wavenet blocks.\n\n    Note: After each block dilation resets to 1 and it increases in each block\n        along the dilation rate.\n\n    Args:\n        in_channels (int): number of input channels.\n        hidden_channes (int): number of hidden channels.\n        kernel_size (int): filter kernel size for the first conv layer.\n        dilation_rate (int): dilations rate to increase dilation per layer.\n            If it is 2, dilations are 1, 2, 4, 8 for the next 4 layers.\n        num_blocks (int): number of wavenet blocks.\n        num_layers (int): number of wavenet layers.\n        c_in_channels (int): number of channels of conditioning input.\n        dropout_p (float): dropout rate.\n        weight_norm (bool): enable/disable weight norm for 
convolution layers.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels,\n        hidden_channels,\n        kernel_size,\n        dilation_rate,\n        num_blocks,\n        num_layers,\n        c_in_channels=0,\n        dropout_p=0,\n        weight_norm=True,\n    ):\n        super().__init__()\n        self.wn_blocks = nn.ModuleList()\n        for idx in range(num_blocks):\n            layer = WN(\n                in_channels=in_channels if idx == 0 else hidden_channels,\n                hidden_channels=hidden_channels,\n                kernel_size=kernel_size,\n                dilation_rate=dilation_rate,\n                num_layers=num_layers,\n                c_in_channels=c_in_channels,\n                dropout_p=dropout_p,\n                weight_norm=weight_norm,\n            )\n            self.wn_blocks.append(layer)\n\n    def forward(self, x, x_mask=None, g=None):\n        o = x\n        for layer in self.wn_blocks:\n            o = layer(o, x_mask, g)\n        return o\n"
  },
  {
    "path": "TTS/tts/layers/glow_tts/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/layers/glow_tts/decoder.py",
    "content": "import torch\nfrom torch import nn\n\nfrom TTS.tts.layers.generic.normalization import ActNorm\nfrom TTS.tts.layers.glow_tts.glow import CouplingBlock, InvConvNear\n\n\ndef squeeze(x, x_mask=None, num_sqz=2):\n    \"\"\"GlowTTS squeeze operation\n    Increase number of channels and reduce number of time steps\n    by the same factor.\n\n    Note:\n        each 's' is a n-dimensional vector.\n        ``[s1,s2,s3,s4,s5,s6] --> [[s1, s3, s5], [s2, s4, s6]]``\n    \"\"\"\n    b, c, t = x.size()\n\n    t = (t // num_sqz) * num_sqz\n    x = x[:, :, :t]\n    x_sqz = x.view(b, c, t // num_sqz, num_sqz)\n    x_sqz = x_sqz.permute(0, 3, 1, 2).contiguous().view(b, c * num_sqz, t // num_sqz)\n\n    if x_mask is not None:\n        x_mask = x_mask[:, :, num_sqz - 1 :: num_sqz]\n    else:\n        x_mask = torch.ones(b, 1, t // num_sqz).to(device=x.device, dtype=x.dtype)\n    return x_sqz * x_mask, x_mask\n\n\ndef unsqueeze(x, x_mask=None, num_sqz=2):\n    \"\"\"GlowTTS unsqueeze operation (revert the squeeze)\n\n    Note:\n        each 's' is a n-dimensional vector.\n        ``[[s1, s3, s5], [s2, s4, s6]] --> [[s1, s3, s5, s2, s4, s6]]``\n    \"\"\"\n    b, c, t = x.size()\n\n    x_unsqz = x.view(b, num_sqz, c // num_sqz, t)\n    x_unsqz = x_unsqz.permute(0, 2, 3, 1).contiguous().view(b, c // num_sqz, t * num_sqz)\n\n    if x_mask is not None:\n        x_mask = x_mask.unsqueeze(-1).repeat(1, 1, 1, num_sqz).view(b, 1, t * num_sqz)\n    else:\n        x_mask = torch.ones(b, 1, t * num_sqz).to(device=x.device, dtype=x.dtype)\n    return x_unsqz * x_mask, x_mask\n\n\nclass Decoder(nn.Module):\n    \"\"\"Stack of Glow Decoder Modules.\n\n    ::\n\n        Squeeze -> ActNorm -> InvertibleConv1x1 -> AffineCoupling -> Unsqueeze\n\n    Args:\n        in_channels (int): channels of input tensor.\n        hidden_channels (int): hidden decoder channels.\n        kernel_size (int): Coupling block kernel size. 
(Wavenet filter kernel size.)\n        dilation_rate (int): rate to increase dilation by each layer in a decoder block.\n        num_flow_blocks (int): number of decoder blocks.\n        num_coupling_layers (int): number coupling layers. (number of wavenet layers.)\n        dropout_p (float): wavenet dropout rate.\n        sigmoid_scale (bool): enable/disable sigmoid scaling in coupling layer.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels,\n        hidden_channels,\n        kernel_size,\n        dilation_rate,\n        num_flow_blocks,\n        num_coupling_layers,\n        dropout_p=0.0,\n        num_splits=4,\n        num_squeeze=2,\n        sigmoid_scale=False,\n        c_in_channels=0,\n    ):\n        super().__init__()\n\n        self.in_channels = in_channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dilation_rate = dilation_rate\n        self.num_flow_blocks = num_flow_blocks\n        self.num_coupling_layers = num_coupling_layers\n        self.dropout_p = dropout_p\n        self.num_splits = num_splits\n        self.num_squeeze = num_squeeze\n        self.sigmoid_scale = sigmoid_scale\n        self.c_in_channels = c_in_channels\n\n        self.flows = nn.ModuleList()\n        for _ in range(num_flow_blocks):\n            self.flows.append(ActNorm(channels=in_channels * num_squeeze))\n            self.flows.append(InvConvNear(channels=in_channels * num_squeeze, num_splits=num_splits))\n            self.flows.append(\n                CouplingBlock(\n                    in_channels * num_squeeze,\n                    hidden_channels,\n                    kernel_size=kernel_size,\n                    dilation_rate=dilation_rate,\n                    num_layers=num_coupling_layers,\n                    c_in_channels=c_in_channels,\n                    dropout_p=dropout_p,\n                    sigmoid_scale=sigmoid_scale,\n                )\n            )\n\n    def forward(self, 
x, x_mask, g=None, reverse=False):\n        \"\"\"\n        Shapes:\n            - x:  :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1 ,T]`\n            - g: :math:`[B, C]`\n        \"\"\"\n        if not reverse:\n            flows = self.flows\n            logdet_tot = 0\n        else:\n            flows = reversed(self.flows)\n            logdet_tot = None\n\n        if self.num_squeeze > 1:\n            x, x_mask = squeeze(x, x_mask, self.num_squeeze)\n        for f in flows:\n            if not reverse:\n                x, logdet = f(x, x_mask, g=g, reverse=reverse)\n                logdet_tot += logdet\n            else:\n                x, logdet = f(x, x_mask, g=g, reverse=reverse)\n        if self.num_squeeze > 1:\n            x, x_mask = unsqueeze(x, x_mask, self.num_squeeze)\n        return x, logdet_tot\n\n    def store_inverse(self):\n        for f in self.flows:\n            f.store_inverse()\n"
  },
  {
    "path": "TTS/tts/layers/glow_tts/duration_predictor.py",
    "content": "import torch\nfrom torch import nn\n\nfrom ..generic.normalization import LayerNorm\n\n\nclass DurationPredictor(nn.Module):\n    \"\"\"Glow-TTS duration prediction model.\n\n    ::\n\n        [2 x (conv1d_kxk -> relu -> layer_norm -> dropout)] -> conv1d_1x1 -> durs\n\n    Args:\n        in_channels (int): Number of channels of the input tensor.\n        hidden_channels (int): Number of hidden channels of the network.\n        kernel_size (int): Kernel size for the conv layers.\n        dropout_p (float): Dropout rate used after each conv layer.\n    \"\"\"\n\n    def __init__(self, in_channels, hidden_channels, kernel_size, dropout_p, cond_channels=None, language_emb_dim=None):\n        super().__init__()\n\n        # add language embedding dim in the input\n        if language_emb_dim:\n            in_channels += language_emb_dim\n\n        # class arguments\n        self.in_channels = in_channels\n        self.filter_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dropout_p = dropout_p\n        # layers\n        self.drop = nn.Dropout(dropout_p)\n        self.conv_1 = nn.Conv1d(in_channels, hidden_channels, kernel_size, padding=kernel_size // 2)\n        self.norm_1 = LayerNorm(hidden_channels)\n        self.conv_2 = nn.Conv1d(hidden_channels, hidden_channels, kernel_size, padding=kernel_size // 2)\n        self.norm_2 = LayerNorm(hidden_channels)\n        # output layer\n        self.proj = nn.Conv1d(hidden_channels, 1, 1)\n        if cond_channels is not None and cond_channels != 0:\n            self.cond = nn.Conv1d(cond_channels, in_channels, 1)\n\n        if language_emb_dim != 0 and language_emb_dim is not None:\n            self.cond_lang = nn.Conv1d(language_emb_dim, in_channels, 1)\n\n    def forward(self, x, x_mask, g=None, lang_emb=None):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n            - g: :math:`[B, C, 1]`\n        \"\"\"\n     
   if g is not None:\n            x = x + self.cond(g)\n\n        if lang_emb is not None:\n            x = x + self.cond_lang(lang_emb)\n\n        x = self.conv_1(x * x_mask)\n        x = torch.relu(x)\n        x = self.norm_1(x)\n        x = self.drop(x)\n        x = self.conv_2(x * x_mask)\n        x = torch.relu(x)\n        x = self.norm_2(x)\n        x = self.drop(x)\n        x = self.proj(x * x_mask)\n        return x * x_mask\n"
  },
  {
    "path": "TTS/tts/layers/glow_tts/encoder.py",
    "content": "import math\n\nimport torch\nfrom torch import nn\n\nfrom TTS.tts.layers.generic.gated_conv import GatedConvBlock\nfrom TTS.tts.layers.generic.res_conv_bn import ResidualConv1dBNBlock\nfrom TTS.tts.layers.generic.time_depth_sep_conv import TimeDepthSeparableConvBlock\nfrom TTS.tts.layers.glow_tts.duration_predictor import DurationPredictor\nfrom TTS.tts.layers.glow_tts.glow import ResidualConv1dLayerNormBlock\nfrom TTS.tts.layers.glow_tts.transformer import RelativePositionTransformer\nfrom TTS.tts.utils.helpers import sequence_mask\n\n\nclass Encoder(nn.Module):\n    \"\"\"Glow-TTS encoder module.\n\n    ::\n\n        embedding -> <prenet> -> encoder_module -> <postnet> --> proj_mean\n                                                             |\n                                                             |-> proj_var\n                                                             |\n                                                             |-> concat -> duration_predictor\n                                                                    ↑\n                                                              speaker_embed\n\n    Args:\n        num_chars (int): number of characters.\n        out_channels (int): number of output channels.\n        hidden_channels (int): encoder's embedding size.\n        hidden_channels_ffn (int): transformer's feed-forward channels.\n        kernel_size (int): kernel size for conv layers and duration predictor.\n        dropout_p (float): dropout rate for any dropout layer.\n        mean_only (bool): if True, output only mean values and use constant std.\n        use_prenet (bool): if True, use pre-convolutional layers before transformer layers.\n        c_in_channels (int): number of channels in conditional input.\n\n    Shapes:\n        - input: (B, T, C)\n\n    ::\n\n        suggested encoder params...\n\n        for encoder_type == 'rel_pos_transformer'\n            encoder_params={\n                
'kernel_size':3,\n                'dropout_p': 0.1,\n                'num_layers': 6,\n                'num_heads': 2,\n                'hidden_channels_ffn': 768,  # 4 times the hidden_channels\n                'input_length': None\n            }\n\n        for encoder_type == 'gated_conv'\n            encoder_params={\n                'kernel_size':5,\n                'dropout_p': 0.1,\n                'num_layers': 9,\n            }\n\n        for encoder_type == 'residual_conv_bn'\n            encoder_params={\n                \"kernel_size\": 4,\n                \"dilations\": [1, 2, 4, 1, 2, 4, 1, 2, 4, 1, 2, 4, 1],\n                \"num_conv_blocks\": 2,\n                \"num_res_blocks\": 13\n            }\n\n         for encoder_type == 'time_depth_separable'\n            encoder_params={\n                \"kernel_size\": 5,\n                'num_layers': 9,\n            }\n    \"\"\"\n\n    def __init__(\n        self,\n        num_chars,\n        out_channels,\n        hidden_channels,\n        hidden_channels_dp,\n        encoder_type,\n        encoder_params,\n        dropout_p_dp=0.1,\n        mean_only=False,\n        use_prenet=True,\n        c_in_channels=0,\n    ):\n        super().__init__()\n        # class arguments\n        self.num_chars = num_chars\n        self.out_channels = out_channels\n        self.hidden_channels = hidden_channels\n        self.hidden_channels_dp = hidden_channels_dp\n        self.dropout_p_dp = dropout_p_dp\n        self.mean_only = mean_only\n        self.use_prenet = use_prenet\n        self.c_in_channels = c_in_channels\n        self.encoder_type = encoder_type\n        # embedding layer\n        self.emb = nn.Embedding(num_chars, hidden_channels)\n        nn.init.normal_(self.emb.weight, 0.0, hidden_channels**-0.5)\n        # init encoder module\n        if encoder_type.lower() == \"rel_pos_transformer\":\n            if use_prenet:\n                self.prenet = ResidualConv1dLayerNormBlock(\n                   
 hidden_channels, hidden_channels, hidden_channels, kernel_size=5, num_layers=3, dropout_p=0.5\n                )\n            self.encoder = RelativePositionTransformer(\n                hidden_channels, hidden_channels, hidden_channels, **encoder_params\n            )\n        elif encoder_type.lower() == \"gated_conv\":\n            self.encoder = GatedConvBlock(hidden_channels, **encoder_params)\n        elif encoder_type.lower() == \"residual_conv_bn\":\n            if use_prenet:\n                self.prenet = nn.Sequential(nn.Conv1d(hidden_channels, hidden_channels, 1), nn.ReLU())\n            self.encoder = ResidualConv1dBNBlock(hidden_channels, hidden_channels, hidden_channels, **encoder_params)\n            self.postnet = nn.Sequential(\n                nn.Conv1d(self.hidden_channels, self.hidden_channels, 1), nn.BatchNorm1d(self.hidden_channels)\n            )\n        elif encoder_type.lower() == \"time_depth_separable\":\n            if use_prenet:\n                self.prenet = ResidualConv1dLayerNormBlock(\n                    hidden_channels, hidden_channels, hidden_channels, kernel_size=5, num_layers=3, dropout_p=0.5\n                )\n            self.encoder = TimeDepthSeparableConvBlock(\n                hidden_channels, hidden_channels, hidden_channels, **encoder_params\n            )\n        else:\n            raise ValueError(\" [!] 
Unknown encoder type.\")\n\n        # final projection layers\n        self.proj_m = nn.Conv1d(hidden_channels, out_channels, 1)\n        if not mean_only:\n            self.proj_s = nn.Conv1d(hidden_channels, out_channels, 1)\n        # duration predictor\n        self.duration_predictor = DurationPredictor(\n            hidden_channels + c_in_channels, hidden_channels_dp, 3, dropout_p_dp\n        )\n\n    def forward(self, x, x_lengths, g=None):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, T]`\n            - x_lengths: :math:`[B]`\n            - g (optional): :math:`[B, C, 1]`\n        \"\"\"\n        # embedding layer\n        # [B, T, D]\n        x = self.emb(x) * math.sqrt(self.hidden_channels)\n        # [B, D, T]\n        x = torch.transpose(x, 1, -1)\n        # compute input sequence mask\n        x_mask = torch.unsqueeze(sequence_mask(x_lengths, x.size(2)), 1).to(x.dtype)\n        # prenet\n        if hasattr(self, \"prenet\") and self.use_prenet:\n            x = self.prenet(x, x_mask)\n        # encoder\n        x = self.encoder(x, x_mask)\n        # postnet\n        if hasattr(self, \"postnet\"):\n            x = self.postnet(x) * x_mask\n        # set duration predictor input\n        if g is not None:\n            g_exp = g.expand(-1, -1, x.size(-1))\n            x_dp = torch.cat([x.detach(), g_exp], 1)\n        else:\n            x_dp = x.detach()\n        # final projection layer\n        x_m = self.proj_m(x) * x_mask\n        if not self.mean_only:\n            x_logs = self.proj_s(x) * x_mask\n        else:\n            x_logs = torch.zeros_like(x_m)\n        # duration predictor\n        logw = self.duration_predictor(x_dp, x_mask)\n        return x_m, x_logs, logw, x_mask\n"
  },
  {
    "path": "TTS/tts/layers/glow_tts/glow.py",
    "content": "import torch\nfrom packaging.version import Version\nfrom torch import nn\nfrom torch.nn import functional as F\n\nfrom TTS.tts.layers.generic.wavenet import WN\n\nfrom ..generic.normalization import LayerNorm\n\n\nclass ResidualConv1dLayerNormBlock(nn.Module):\n    \"\"\"Conv1d with Layer Normalization and residual connection as in GlowTTS paper.\n    https://arxiv.org/pdf/1811.00002.pdf\n\n    ::\n\n        x |-> conv1d -> layer_norm -> relu -> dropout -> + -> o\n          |---------------> conv1d_1x1 ------------------|\n\n    Args:\n        in_channels (int): number of input tensor channels.\n        hidden_channels (int): number of inner layer channels.\n        out_channels (int): number of output tensor channels.\n        kernel_size (int): kernel size of conv1d filter.\n        num_layers (int): number of blocks.\n        dropout_p (float): dropout rate for each block.\n    \"\"\"\n\n    def __init__(self, in_channels, hidden_channels, out_channels, kernel_size, num_layers, dropout_p):\n        super().__init__()\n        self.in_channels = in_channels\n        self.hidden_channels = hidden_channels\n        self.out_channels = out_channels\n        self.kernel_size = kernel_size\n        self.num_layers = num_layers\n        self.dropout_p = dropout_p\n        assert num_layers > 1, \" [!] number of layers should be > 0.\"\n        assert kernel_size % 2 == 1, \" [!] 
kernel size should be odd number.\"\n\n        self.conv_layers = nn.ModuleList()\n        self.norm_layers = nn.ModuleList()\n\n        for idx in range(num_layers):\n            self.conv_layers.append(\n                nn.Conv1d(\n                    in_channels if idx == 0 else hidden_channels, hidden_channels, kernel_size, padding=kernel_size // 2\n                )\n            )\n            self.norm_layers.append(LayerNorm(hidden_channels))\n\n        self.proj = nn.Conv1d(hidden_channels, out_channels, 1)\n        self.proj.weight.data.zero_()\n        self.proj.bias.data.zero_()\n\n    def forward(self, x, x_mask):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n        \"\"\"\n        x_res = x\n        for i in range(self.num_layers):\n            x = self.conv_layers[i](x * x_mask)\n            x = self.norm_layers[i](x * x_mask)\n            x = F.dropout(F.relu(x), self.dropout_p, training=self.training)\n        x = x_res + self.proj(x)\n        return x * x_mask\n\n\nclass InvConvNear(nn.Module):\n    \"\"\"Invertible Convolution with input splitting as in GlowTTS paper.\n    https://arxiv.org/pdf/1811.00002.pdf\n\n    Args:\n        channels (int): input and output channels.\n        num_splits (int): number of splits, also H and W of conv layer.\n        no_jacobian (bool): enable/disable jacobian computations.\n\n    Note:\n        Split the input into groups of size self.num_splits and\n        perform 1x1 convolution separately. 
Cast 1x1 conv operation\n        to 2d by reshaping the input for efficiency.\n    \"\"\"\n\n    def __init__(self, channels, num_splits=4, no_jacobian=False, **kwargs):  # pylint: disable=unused-argument\n        super().__init__()\n        assert num_splits % 2 == 0\n        self.channels = channels\n        self.num_splits = num_splits\n        self.no_jacobian = no_jacobian\n        self.weight_inv = None\n\n        if Version(torch.__version__) < Version(\"1.9\"):\n            w_init = torch.qr(torch.FloatTensor(self.num_splits, self.num_splits).normal_())[0]\n        else:\n            w_init = torch.linalg.qr(torch.FloatTensor(self.num_splits, self.num_splits).normal_(), \"complete\")[0]\n\n        if torch.det(w_init) < 0:\n            w_init[:, 0] = -1 * w_init[:, 0]\n        self.weight = nn.Parameter(w_init)\n\n    def forward(self, x, x_mask=None, reverse=False, **kwargs):  # pylint: disable=unused-argument\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n        \"\"\"\n        b, c, t = x.size()\n        assert c % self.num_splits == 0\n        if x_mask is None:\n            x_mask = 1\n            x_len = torch.ones((b,), dtype=x.dtype, device=x.device) * t\n        else:\n            x_len = torch.sum(x_mask, [1, 2])\n\n        x = x.view(b, 2, c // self.num_splits, self.num_splits // 2, t)\n        x = x.permute(0, 1, 3, 2, 4).contiguous().view(b, self.num_splits, c // self.num_splits, t)\n\n        if reverse:\n            if self.weight_inv is not None:\n                weight = self.weight_inv\n            else:\n                weight = torch.inverse(self.weight.float()).to(dtype=self.weight.dtype)\n            logdet = None\n        else:\n            weight = self.weight\n            if self.no_jacobian:\n                logdet = 0\n            else:\n                logdet = torch.logdet(self.weight) * (c / self.num_splits) * x_len  # [b]\n\n        weight = 
weight.view(self.num_splits, self.num_splits, 1, 1)\n        z = F.conv2d(x, weight)\n\n        z = z.view(b, 2, self.num_splits // 2, c // self.num_splits, t)\n        z = z.permute(0, 1, 3, 2, 4).contiguous().view(b, c, t) * x_mask\n        return z, logdet\n\n    def store_inverse(self):\n        weight_inv = torch.inverse(self.weight.float()).to(dtype=self.weight.dtype)\n        self.weight_inv = nn.Parameter(weight_inv, requires_grad=False)\n\n\nclass CouplingBlock(nn.Module):\n    \"\"\"Glow Affine Coupling block as in GlowTTS paper.\n    https://arxiv.org/pdf/1811.00002.pdf\n\n    ::\n\n        x --> x0 -> conv1d -> wavenet -> conv1d --> t, s -> concat(s*x1 + t, x0) -> o\n        '-> x1 - - - - - - - - - - - - - - - - - - - - - - - - - ^\n\n    Args:\n         in_channels (int): number of input tensor channels.\n         hidden_channels (int): number of hidden channels.\n         kernel_size (int): WaveNet filter kernel size.\n         dilation_rate (int): rate to increase dilation by each layer in a decoder block.\n         num_layers (int): number of WaveNet layers.\n         c_in_channels (int): number of conditioning input channels.\n         dropout_p (int): wavenet dropout rate.\n         sigmoid_scale (bool): enable/disable sigmoid scaling for output scale.\n\n    Note:\n         It does not use the conditional inputs differently from WaveGlow.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels,\n        hidden_channels,\n        kernel_size,\n        dilation_rate,\n        num_layers,\n        c_in_channels=0,\n        dropout_p=0,\n        sigmoid_scale=False,\n    ):\n        super().__init__()\n        self.in_channels = in_channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dilation_rate = dilation_rate\n        self.num_layers = num_layers\n        self.c_in_channels = c_in_channels\n        self.dropout_p = dropout_p\n        self.sigmoid_scale = sigmoid_scale\n       
 # input layer\n        start = torch.nn.Conv1d(in_channels // 2, hidden_channels, 1)\n        start = torch.nn.utils.weight_norm(start)\n        self.start = start\n        # output layer\n        # Initializing last layer to 0 makes the affine coupling layers\n        # do nothing at first.  This helps with training stability\n        end = torch.nn.Conv1d(hidden_channels, in_channels, 1)\n        end.weight.data.zero_()\n        end.bias.data.zero_()\n        self.end = end\n        # coupling layers\n        self.wn = WN(hidden_channels, hidden_channels, kernel_size, dilation_rate, num_layers, c_in_channels, dropout_p)\n\n    def forward(self, x, x_mask=None, reverse=False, g=None, **kwargs):  # pylint: disable=unused-argument\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n            - g: :math:`[B, C, 1]`\n        \"\"\"\n        if x_mask is None:\n            x_mask = 1\n        x_0, x_1 = x[:, : self.in_channels // 2], x[:, self.in_channels // 2 :]\n\n        x = self.start(x_0) * x_mask\n        x = self.wn(x, x_mask, g)\n        out = self.end(x)\n\n        z_0 = x_0\n        t = out[:, : self.in_channels // 2, :]\n        s = out[:, self.in_channels // 2 :, :]\n        if self.sigmoid_scale:\n            s = torch.log(1e-6 + torch.sigmoid(s + 2))\n\n        if reverse:\n            z_1 = (x_1 - t) * torch.exp(-s) * x_mask\n            logdet = None\n        else:\n            z_1 = (t + torch.exp(s) * x_1) * x_mask\n            logdet = torch.sum(s * x_mask, [1, 2])\n\n        z = torch.cat([z_0, z_1], 1)\n        return z, logdet\n\n    def store_inverse(self):\n        self.wn.remove_weight_norm()\n"
  },
  {
    "path": "TTS/tts/layers/glow_tts/transformer.py",
    "content": "import math\n\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nfrom TTS.tts.layers.generic.normalization import LayerNorm, LayerNorm2\n\n\nclass RelativePositionMultiHeadAttention(nn.Module):\n    \"\"\"Multi-head attention with Relative Positional embedding.\n    https://arxiv.org/pdf/1809.04281.pdf\n\n    It learns positional embeddings for a window of neighbours. For keys and values,\n    it learns different set of embeddings. Key embeddings are agregated with the attention\n    scores and value embeddings are aggregated with the output.\n\n    Note:\n        Example with relative attention window size 2\n\n        - input = [a, b, c, d, e]\n        - rel_attn_embeddings = [e(t-2), e(t-1), e(t+1), e(t+2)]\n\n        So it learns 4 embedding vectors (in total 8) separately for key and value vectors.\n\n        Considering the input c\n\n        - e(t-2) corresponds to c -> a\n        - e(t-2) corresponds to c -> b\n        - e(t-2) corresponds to c -> d\n        - e(t-2) corresponds to c -> e\n\n        These embeddings are shared among different time steps. So input a, b, d and e also uses\n        the same embeddings.\n\n        Embeddings are ignored when the relative window is out of limit for the first and the last\n        n items.\n\n    Args:\n        channels (int): input and inner layer channels.\n        out_channels (int): output channels.\n        num_heads (int): number of attention heads.\n        rel_attn_window_size (int, optional): relation attention window size.\n            If 4, for each time step next and previous 4 time steps are attended.\n            If default, relative encoding is disabled and it is a regular transformer.\n            Defaults to None.\n        heads_share (bool, optional): [description]. Defaults to True.\n        dropout_p (float, optional): dropout rate. Defaults to 0..\n        input_length (int, optional): intput length for positional encoding. 
Defaults to None.\n        proximal_bias (bool, optional): enable/disable proximal bias as in the paper. Defaults to False.\n        proximal_init (bool, optional): enable/disable poximal init as in the paper.\n            Init key and query layer weights the same. Defaults to False.\n    \"\"\"\n\n    def __init__(\n        self,\n        channels,\n        out_channels,\n        num_heads,\n        rel_attn_window_size=None,\n        heads_share=True,\n        dropout_p=0.0,\n        input_length=None,\n        proximal_bias=False,\n        proximal_init=False,\n    ):\n        super().__init__()\n        assert channels % num_heads == 0, \" [!] channels should be divisible by num_heads.\"\n        # class attributes\n        self.channels = channels\n        self.out_channels = out_channels\n        self.num_heads = num_heads\n        self.rel_attn_window_size = rel_attn_window_size\n        self.heads_share = heads_share\n        self.input_length = input_length\n        self.proximal_bias = proximal_bias\n        self.dropout_p = dropout_p\n        self.attn = None\n        # query, key, value layers\n        self.k_channels = channels // num_heads\n        self.conv_q = nn.Conv1d(channels, channels, 1)\n        self.conv_k = nn.Conv1d(channels, channels, 1)\n        self.conv_v = nn.Conv1d(channels, channels, 1)\n        # output layers\n        self.conv_o = nn.Conv1d(channels, out_channels, 1)\n        self.dropout = nn.Dropout(dropout_p)\n        # relative positional encoding layers\n        if rel_attn_window_size is not None:\n            n_heads_rel = 1 if heads_share else num_heads\n            rel_stddev = self.k_channels**-0.5\n            emb_rel_k = nn.Parameter(\n                torch.randn(n_heads_rel, rel_attn_window_size * 2 + 1, self.k_channels) * rel_stddev\n            )\n            emb_rel_v = nn.Parameter(\n                torch.randn(n_heads_rel, rel_attn_window_size * 2 + 1, self.k_channels) * rel_stddev\n            )\n            
self.register_parameter(\"emb_rel_k\", emb_rel_k)\n            self.register_parameter(\"emb_rel_v\", emb_rel_v)\n\n        # init layers\n        nn.init.xavier_uniform_(self.conv_q.weight)\n        nn.init.xavier_uniform_(self.conv_k.weight)\n        # proximal init: copy the query layer weights into the key layer\n        if proximal_init:\n            self.conv_k.weight.data.copy_(self.conv_q.weight.data)\n            self.conv_k.bias.data.copy_(self.conv_q.bias.data)\n        nn.init.xavier_uniform_(self.conv_v.weight)\n\n    def forward(self, x, c, attn_mask=None):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - c: :math:`[B, C, T]`\n            - attn_mask: :math:`[B, 1, T, T]`\n        \"\"\"\n        q = self.conv_q(x)\n        k = self.conv_k(c)\n        v = self.conv_v(c)\n        x, self.attn = self.attention(q, k, v, mask=attn_mask)\n        x = self.conv_o(x)\n        return x\n\n    def attention(self, query, key, value, mask=None):\n        # reshape [b, d, t] -> [b, n_h, t, d_k]\n        b, d, t_s, t_t = (*key.size(), query.size(2))\n        query = query.view(b, self.num_heads, self.k_channels, t_t).transpose(2, 3)\n        key = key.view(b, self.num_heads, self.k_channels, t_s).transpose(2, 3)\n        value = value.view(b, self.num_heads, self.k_channels, t_s).transpose(2, 3)\n        # compute raw attention scores\n        scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(self.k_channels)\n        # relative positional encoding for scores\n        if self.rel_attn_window_size is not None:\n            assert t_s == t_t, \"Relative attention is only available for self-attention.\"\n            # get relative key embeddings\n            key_relative_embeddings = self._get_relative_embeddings(self.emb_rel_k, t_s)\n            rel_logits = self._matmul_with_relative_keys(query, key_relative_embeddings)\n            rel_logits = self._relative_position_to_absolute_position(rel_logits)\n            scores_local = rel_logits / 
math.sqrt(self.k_channels)\n            scores = scores + scores_local\n        # proximal bias\n        if self.proximal_bias:\n            assert t_s == t_t, \"Proximal bias is only available for self-attention.\"\n            scores = scores + self._attn_proximity_bias(t_s).to(device=scores.device, dtype=scores.dtype)\n        # attention score masking\n        if mask is not None:\n            # use a large negative value instead of -inf to prevent out-of-range errors.\n            scores = scores.masked_fill(mask == 0, -1e4)\n            if self.input_length is not None:\n                block_mask = torch.ones_like(scores).triu(-1 * self.input_length).tril(self.input_length)\n                scores = scores * block_mask + -1e4 * (1 - block_mask)\n        # attention score normalization\n        p_attn = F.softmax(scores, dim=-1)  # [b, n_h, t_t, t_s]\n        # apply dropout to attention weights\n        p_attn = self.dropout(p_attn)\n        # compute output\n        output = torch.matmul(p_attn, value)\n        # relative positional encoding for values\n        if self.rel_attn_window_size is not None:\n            relative_weights = self._absolute_position_to_relative_position(p_attn)\n            value_relative_embeddings = self._get_relative_embeddings(self.emb_rel_v, t_s)\n            output = output + self._matmul_with_relative_values(relative_weights, value_relative_embeddings)\n        output = output.transpose(2, 3).contiguous().view(b, d, t_t)  # [b, n_h, t_t, d_k] -> [b, d, t_t]\n        return output, p_attn\n\n    @staticmethod\n    def _matmul_with_relative_values(p_attn, re):\n        \"\"\"\n        Args:\n            p_attn (Tensor): attention weights.\n            re (Tensor): relative value embedding vector. 
(a_(i,j)^V)\n\n        Shapes:\n            - p_attn: :math:`[B, H, T, V]`\n            - re: :math:`[H or 1, V, D]`\n            - logits: :math:`[B, H, T, D]`\n        \"\"\"\n        logits = torch.matmul(p_attn, re.unsqueeze(0))\n        return logits\n\n    @staticmethod\n    def _matmul_with_relative_keys(query, re):\n        \"\"\"\n        Args:\n            query (Tensor): batch of query vectors. (x*W^Q)\n            re (Tensor): relative key embedding vector. (a_(i,j)^K)\n\n        Shapes:\n            - query: :math:`[B, H, T, D]`\n            - re: :math:`[H or 1, V, D]`\n            - logits: :math:`[B, H, T, V]`\n        \"\"\"\n        # logits = torch.einsum('bhld, kmd -> bhlm', [query, re.to(query.dtype)])\n        logits = torch.matmul(query, re.unsqueeze(0).transpose(-2, -1))\n        return logits\n\n    def _get_relative_embeddings(self, relative_embeddings, length):\n        \"\"\"Convert embedding vectors to a tensor of embeddings.\"\"\"\n        # Pad first before slice to avoid using cond ops.\n        pad_length = max(length - (self.rel_attn_window_size + 1), 0)\n        slice_start_position = max((self.rel_attn_window_size + 1) - length, 0)\n        slice_end_position = slice_start_position + 2 * length - 1\n        if pad_length > 0:\n            padded_relative_embeddings = F.pad(relative_embeddings, [0, 0, pad_length, pad_length, 0, 0])\n        else:\n            padded_relative_embeddings = relative_embeddings\n        used_relative_embeddings = padded_relative_embeddings[:, slice_start_position:slice_end_position]\n        return used_relative_embeddings\n\n    @staticmethod\n    def _relative_position_to_absolute_position(x):\n        \"\"\"Converts tensor from relative to absolute indexing for local attention.\n        Shapes:\n            x: :math:`[B, C, T, 2 * T - 1]`\n        Returns:\n            A Tensor of shape :math:`[B, C, T, T]`\n        \"\"\"\n        batch, heads, length, _ = x.size()\n        # Pad to shift from 
relative to absolute indexing.\n        x = F.pad(x, [0, 1, 0, 0, 0, 0, 0, 0])\n        # Pad extra elements so the total adds up to shape (len+1, 2*len-1).\n        x_flat = x.view([batch, heads, length * 2 * length])\n        x_flat = F.pad(x_flat, [0, length - 1, 0, 0, 0, 0])\n        # Reshape and slice out the padded elements.\n        x_final = x_flat.view([batch, heads, length + 1, 2 * length - 1])[:, :, :length, length - 1 :]\n        return x_final\n\n    @staticmethod\n    def _absolute_position_to_relative_position(x):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T, T]`\n            - ret: :math:`[B, C, T, 2*T-1]`\n        \"\"\"\n        batch, heads, length, _ = x.size()\n        # pad along the column dimension\n        x = F.pad(x, [0, length - 1, 0, 0, 0, 0, 0, 0])\n        x_flat = x.view([batch, heads, length**2 + length * (length - 1)])\n        # add 0's in the beginning that will skew the elements after reshape\n        x_flat = F.pad(x_flat, [length, 0, 0, 0, 0, 0])\n        x_final = x_flat.view([batch, heads, length, 2 * length])[:, :, :, 1:]\n        return x_final\n\n    @staticmethod\n    def _attn_proximity_bias(length):\n        \"\"\"Produce an attention mask that discourages distant\n        attention values.\n        Args:\n            length (int): an integer scalar.\n        Returns:\n            a Tensor with shape :math:`[1, 1, T, T]`\n        \"\"\"\n        # L\n        r = torch.arange(length, dtype=torch.float32)\n        # L x L\n        diff = torch.unsqueeze(r, 0) - torch.unsqueeze(r, 1)\n        # scale mask values\n        diff = -torch.log1p(torch.abs(diff))\n        # 1 x 1 x L x L\n        return diff.unsqueeze(0).unsqueeze(0)\n\n\nclass FeedForwardNetwork(nn.Module):\n    \"\"\"Feed Forward Inner layers for Transformer.\n\n    Args:\n        in_channels (int): input tensor channels.\n        out_channels (int): output tensor channels.\n        hidden_channels (int): inner layers hidden channels.\n        kernel_size 
(int): conv1d filter kernel size.\n        dropout_p (float, optional): dropout rate. Defaults to 0.\n        causal (bool, optional): pad only on the left (causal) instead of on both sides. Defaults to False.\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, hidden_channels, kernel_size, dropout_p=0.0, causal=False):\n        super().__init__()\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dropout_p = dropout_p\n\n        if causal:\n            self.padding = self._causal_padding\n        else:\n            self.padding = self._same_padding\n\n        self.conv_1 = nn.Conv1d(in_channels, hidden_channels, kernel_size)\n        self.conv_2 = nn.Conv1d(hidden_channels, out_channels, kernel_size)\n        self.dropout = nn.Dropout(dropout_p)\n\n    def forward(self, x, x_mask):\n        x = self.conv_1(self.padding(x * x_mask))\n        x = torch.relu(x)\n        x = self.dropout(x)\n        x = self.conv_2(self.padding(x * x_mask))\n        return x * x_mask\n\n    def _causal_padding(self, x):\n        if self.kernel_size == 1:\n            return x\n        pad_l = self.kernel_size - 1\n        pad_r = 0\n        padding = [[0, 0], [0, 0], [pad_l, pad_r]]\n        x = F.pad(x, self._pad_shape(padding))\n        return x\n\n    def _same_padding(self, x):\n        if self.kernel_size == 1:\n            return x\n        pad_l = (self.kernel_size - 1) // 2\n        pad_r = self.kernel_size // 2\n        padding = [[0, 0], [0, 0], [pad_l, pad_r]]\n        x = F.pad(x, self._pad_shape(padding))\n        return x\n\n    @staticmethod\n    def _pad_shape(padding):\n        l = padding[::-1]\n        pad_shape = [item for sublist in l for item in sublist]\n        return pad_shape\n\n\nclass RelativePositionTransformer(nn.Module):\n    \"\"\"Transformer with Relative Positional Encoding.\n    https://arxiv.org/abs/1803.02155\n\n    Args:\n        in_channels (int): number of channels of the input tensor.\n        out_channels 
(int): number of channels of the output tensor.\n        hidden_channels (int): model hidden channels.\n        hidden_channels_ffn (int): hidden channels of FeedForwardNetwork.\n        num_heads (int): number of attention heads.\n        num_layers (int): number of transformer layers.\n        kernel_size (int, optional): kernel size of feed-forward inner layers. Defaults to 1.\n        dropout_p (float, optional): dropout rate for self-attention and feed-forward inner layers. Defaults to 0.\n        rel_attn_window_size (int, optional): relative attention window size.\n            If 4, for each time step next and previous 4 time steps are attended.\n            If default, relative encoding is disabled and it is a regular transformer.\n            Defaults to None.\n        input_length (int, optional): input length to limit position encoding. Defaults to None.\n        layer_norm_type (str, optional): type \"1\" uses torch tensor operations and type \"2\" uses torch layer_norm\n            primitive. Use type \"2\", type \"1\" is for backward compatibility. 
Defaults to \"1\".\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels: int,\n        out_channels: int,\n        hidden_channels: int,\n        hidden_channels_ffn: int,\n        num_heads: int,\n        num_layers: int,\n        kernel_size=1,\n        dropout_p=0.0,\n        rel_attn_window_size: int = None,\n        input_length: int = None,\n        layer_norm_type: str = \"1\",\n    ):\n        super().__init__()\n        self.hidden_channels = hidden_channels\n        self.hidden_channels_ffn = hidden_channels_ffn\n        self.num_heads = num_heads\n        self.num_layers = num_layers\n        self.kernel_size = kernel_size\n        self.dropout_p = dropout_p\n        self.rel_attn_window_size = rel_attn_window_size\n\n        self.dropout = nn.Dropout(dropout_p)\n        self.attn_layers = nn.ModuleList()\n        self.norm_layers_1 = nn.ModuleList()\n        self.ffn_layers = nn.ModuleList()\n        self.norm_layers_2 = nn.ModuleList()\n\n        for idx in range(self.num_layers):\n            self.attn_layers.append(\n                RelativePositionMultiHeadAttention(\n                    hidden_channels if idx != 0 else in_channels,\n                    hidden_channels,\n                    num_heads,\n                    rel_attn_window_size=rel_attn_window_size,\n                    dropout_p=dropout_p,\n                    input_length=input_length,\n                )\n            )\n            if layer_norm_type == \"1\":\n                self.norm_layers_1.append(LayerNorm(hidden_channels))\n            elif layer_norm_type == \"2\":\n                self.norm_layers_1.append(LayerNorm2(hidden_channels))\n            else:\n                raise ValueError(\" [!] 
Unknown layer norm type\")\n\n            if hidden_channels != out_channels and (idx + 1) == self.num_layers:\n                self.proj = nn.Conv1d(hidden_channels, out_channels, 1)\n\n            self.ffn_layers.append(\n                FeedForwardNetwork(\n                    hidden_channels,\n                    hidden_channels if (idx + 1) != self.num_layers else out_channels,\n                    hidden_channels_ffn,\n                    kernel_size,\n                    dropout_p=dropout_p,\n                )\n            )\n\n            if layer_norm_type == \"1\":\n                self.norm_layers_2.append(LayerNorm(hidden_channels if (idx + 1) != self.num_layers else out_channels))\n            elif layer_norm_type == \"2\":\n                self.norm_layers_2.append(LayerNorm2(hidden_channels if (idx + 1) != self.num_layers else out_channels))\n            else:\n                raise ValueError(\" [!] Unknown layer norm type\")\n\n    def forward(self, x, x_mask):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n        \"\"\"\n        attn_mask = x_mask.unsqueeze(2) * x_mask.unsqueeze(-1)\n        for i in range(self.num_layers):\n            x = x * x_mask\n            y = self.attn_layers[i](x, x, attn_mask)\n            y = self.dropout(y)\n            x = self.norm_layers_1[i](x + y)\n\n            y = self.ffn_layers[i](x, x_mask)\n            y = self.dropout(y)\n\n            if (i + 1) == self.num_layers and hasattr(self, \"proj\"):\n                x = self.proj(x)\n\n            x = self.norm_layers_2[i](x + y)\n        x = x * x_mask\n        return x\n"
  },
  {
    "path": "TTS/tts/layers/losses.py",
    "content": "import math\n\nimport numpy as np\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.nn import functional\n\nfrom TTS.tts.utils.helpers import sequence_mask\nfrom TTS.tts.utils.ssim import SSIMLoss as _SSIMLoss\nfrom TTS.utils.audio.torch_transforms import TorchSTFT\n\n\n# pylint: disable=abstract-method\n# relates https://github.com/pytorch/pytorch/issues/42305\nclass L1LossMasked(nn.Module):\n    def __init__(self, seq_len_norm):\n        super().__init__()\n        self.seq_len_norm = seq_len_norm\n\n    def forward(self, x, target, length):\n        \"\"\"\n        Args:\n            x: A Variable containing a FloatTensor of size\n                (batch, max_len, dim) which contains the\n                unnormalized probability for each class.\n            target: A Variable containing a LongTensor of size\n                (batch, max_len, dim) which contains the index of the true\n                class for each corresponding step.\n            length: A Variable containing a LongTensor of size (batch,)\n                which contains the length of each data in a batch.\n        Shapes:\n            x: B x T X D\n            target: B x T x D\n            length: B\n        Returns:\n            loss: An average loss value in range [0, 1] masked by the length.\n        \"\"\"\n        # mask: (batch, max_len, 1)\n        target.requires_grad = False\n        mask = sequence_mask(sequence_length=length, max_len=target.size(1)).unsqueeze(2).float()\n        if self.seq_len_norm:\n            norm_w = mask / mask.sum(dim=1, keepdim=True)\n            out_weights = norm_w.div(target.shape[0] * target.shape[2])\n            mask = mask.expand_as(x)\n            loss = functional.l1_loss(x * mask, target * mask, reduction=\"none\")\n            loss = loss.mul(out_weights.to(loss.device)).sum()\n        else:\n            mask = mask.expand_as(x)\n            loss = functional.l1_loss(x * mask, target * mask, 
reduction=\"sum\")\n            loss = loss / mask.sum()\n        return loss\n\n\nclass MSELossMasked(nn.Module):\n    def __init__(self, seq_len_norm):\n        super().__init__()\n        self.seq_len_norm = seq_len_norm\n\n    def forward(self, x, target, length):\n        \"\"\"\n        Args:\n            x: A Variable containing a FloatTensor of size\n                (batch, max_len, dim) which contains the\n                unnormalized probability for each class.\n            target: A Variable containing a LongTensor of size\n                (batch, max_len, dim) which contains the index of the true\n                class for each corresponding step.\n            length: A Variable containing a LongTensor of size (batch,)\n                which contains the length of each data in a batch.\n        Shapes:\n            - x: :math:`[B, T, D]`\n            - target: :math:`[B, T, D]`\n            - length: :math:`B`\n        Returns:\n            loss: An average loss value in range [0, 1] masked by the length.\n        \"\"\"\n        # mask: (batch, max_len, 1)\n        target.requires_grad = False\n        mask = sequence_mask(sequence_length=length, max_len=target.size(1)).unsqueeze(2).float()\n        if self.seq_len_norm:\n            norm_w = mask / mask.sum(dim=1, keepdim=True)\n            out_weights = norm_w.div(target.shape[0] * target.shape[2])\n            mask = mask.expand_as(x)\n            loss = functional.mse_loss(x * mask, target * mask, reduction=\"none\")\n            loss = loss.mul(out_weights.to(loss.device)).sum()\n        else:\n            mask = mask.expand_as(x)\n            loss = functional.mse_loss(x * mask, target * mask, reduction=\"sum\")\n            loss = loss / mask.sum()\n        return loss\n\n\ndef sample_wise_min_max(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:\n    \"\"\"Min-Max normalize tensor through first dimension\n    Shapes:\n        - x: :math:`[B, D1, D2]`\n        - m: :math:`[B, D1, 1]`\n    
\"\"\"\n    maximum = torch.amax(x.masked_fill(~mask, 0), dim=(1, 2), keepdim=True)\n    minimum = torch.amin(x.masked_fill(~mask, np.inf), dim=(1, 2), keepdim=True)\n    return (x - minimum) / (maximum - minimum + 1e-8)\n\n\nclass SSIMLoss(torch.nn.Module):\n    \"\"\"SSIM loss as (1 - SSIM)\n    SSIM is explained here https://en.wikipedia.org/wiki/Structural_similarity\n    \"\"\"\n\n    def __init__(self):\n        super().__init__()\n        self.loss_func = _SSIMLoss()\n\n    def forward(self, y_hat, y, length):\n        \"\"\"\n        Args:\n            y_hat (tensor): model prediction values.\n            y (tensor): target values.\n            length (tensor): length of each sample in a batch for masking.\n\n        Shapes:\n            y_hat: B x T X D\n            y: B x T x D\n            length: B\n\n         Returns:\n            loss: An average loss value in range [0, 1] masked by the length.\n        \"\"\"\n        mask = sequence_mask(sequence_length=length, max_len=y.size(1)).unsqueeze(2)\n        y_norm = sample_wise_min_max(y, mask)\n        y_hat_norm = sample_wise_min_max(y_hat, mask)\n        ssim_loss = self.loss_func((y_norm * mask).unsqueeze(1), (y_hat_norm * mask).unsqueeze(1))\n\n        if ssim_loss.item() > 1.0:\n            print(f\" > SSIM loss is out-of-range {ssim_loss.item()}, setting it 1.0\")\n            ssim_loss = torch.tensor(1.0, device=ssim_loss.device)\n\n        if ssim_loss.item() < 0.0:\n            print(f\" > SSIM loss is out-of-range {ssim_loss.item()}, setting it 0.0\")\n            ssim_loss = torch.tensor(0.0, device=ssim_loss.device)\n\n        return ssim_loss\n\n\nclass AttentionEntropyLoss(nn.Module):\n    # pylint: disable=R0201\n    def forward(self, align):\n        \"\"\"\n        Forces attention to be more decisive by penalizing\n        soft attention weights\n        \"\"\"\n        entropy = torch.distributions.Categorical(probs=align).entropy()\n        loss = (entropy / 
np.log(align.shape[1])).mean()\n        return loss\n\n\nclass BCELossMasked(nn.Module):\n    \"\"\"BCE loss with masking.\n\n    Used mainly for stopnet in autoregressive models.\n\n    Args:\n        pos_weight (float): weight for positive samples. If set < 1, penalize early stopping. Defaults to None.\n    \"\"\"\n\n    def __init__(self, pos_weight: float = None):\n        super().__init__()\n        self.pos_weight = nn.Parameter(torch.tensor([pos_weight]), requires_grad=False)\n\n    def forward(self, x, target, length):\n        \"\"\"\n        Args:\n            x: A Variable containing a FloatTensor of size\n                (batch, max_len) which contains the\n                unnormalized probability for each class.\n            target: A Variable containing a LongTensor of size\n                (batch, max_len) which contains the index of the true\n                class for each corresponding step.\n            length: A Variable containing a LongTensor of size (batch,)\n                which contains the length of each data in a batch.\n        Shapes:\n            x: B x T\n            target: B x T\n            length: B\n        Returns:\n            loss: An average loss value in range [0, 1] masked by the length.\n        \"\"\"\n        target.requires_grad = False\n        if length is not None:\n            # mask: (batch, max_len, 1)\n            mask = sequence_mask(sequence_length=length, max_len=target.size(1))\n            num_items = mask.sum()\n            loss = functional.binary_cross_entropy_with_logits(\n                x.masked_select(mask), target.masked_select(mask), pos_weight=self.pos_weight, reduction=\"sum\"\n            )\n        else:\n            loss = functional.binary_cross_entropy_with_logits(x, target, pos_weight=self.pos_weight, reduction=\"sum\")\n            num_items = torch.numel(x)\n        loss = loss / num_items\n        return loss\n\n\nclass DifferentialSpectralLoss(nn.Module):\n    \"\"\"Differential Spectral 
Loss\n    https://arxiv.org/ftp/arxiv/papers/1909/1909.10302.pdf\"\"\"\n\n    def __init__(self, loss_func):\n        super().__init__()\n        self.loss_func = loss_func\n\n    def forward(self, x, target, length=None):\n        \"\"\"\n         Shapes:\n            x: B x T\n            target: B x T\n            length: B\n        Returns:\n            loss: An average loss value in range [0, 1] masked by the length.\n        \"\"\"\n        x_diff = x[:, 1:] - x[:, :-1]\n        target_diff = target[:, 1:] - target[:, :-1]\n        if length is None:\n            return self.loss_func(x_diff, target_diff)\n        return self.loss_func(x_diff, target_diff, length - 1)\n\n\nclass GuidedAttentionLoss(torch.nn.Module):\n    def __init__(self, sigma=0.4):\n        super().__init__()\n        self.sigma = sigma\n\n    def _make_ga_masks(self, ilens, olens):\n        B = len(ilens)\n        max_ilen = max(ilens)\n        max_olen = max(olens)\n        ga_masks = torch.zeros((B, max_olen, max_ilen))\n        for idx, (ilen, olen) in enumerate(zip(ilens, olens)):\n            ga_masks[idx, :olen, :ilen] = self._make_ga_mask(ilen, olen, self.sigma)\n        return ga_masks\n\n    def forward(self, att_ws, ilens, olens):\n        ga_masks = self._make_ga_masks(ilens, olens).to(att_ws.device)\n        seq_masks = self._make_masks(ilens, olens).to(att_ws.device)\n        losses = ga_masks * att_ws\n        loss = torch.mean(losses.masked_select(seq_masks))\n        return loss\n\n    @staticmethod\n    def _make_ga_mask(ilen, olen, sigma):\n        grid_x, grid_y = torch.meshgrid(torch.arange(olen).to(olen), torch.arange(ilen).to(ilen))\n        grid_x, grid_y = grid_x.float(), grid_y.float()\n        return 1.0 - torch.exp(-((grid_y / ilen - grid_x / olen) ** 2) / (2 * (sigma**2)))\n\n    @staticmethod\n    def _make_masks(ilens, olens):\n        in_masks = sequence_mask(ilens)\n        out_masks = sequence_mask(olens)\n        return out_masks.unsqueeze(-1) & 
in_masks.unsqueeze(-2)\n\n\nclass Huber(nn.Module):\n    # pylint: disable=R0201\n    def forward(self, x, y, length=None):\n        \"\"\"\n        Shapes:\n            x: B x T\n            y: B x T\n            length: B\n        \"\"\"\n        mask = sequence_mask(sequence_length=length, max_len=y.size(1)).unsqueeze(2).float()\n        return torch.nn.functional.smooth_l1_loss(x * mask, y * mask, reduction=\"sum\") / mask.sum()\n\n\nclass ForwardSumLoss(nn.Module):\n    def __init__(self, blank_logprob=-1):\n        super().__init__()\n        self.log_softmax = torch.nn.LogSoftmax(dim=3)\n        self.ctc_loss = torch.nn.CTCLoss(zero_infinity=True)\n        self.blank_logprob = blank_logprob\n\n    def forward(self, attn_logprob, in_lens, out_lens):\n        key_lens = in_lens\n        query_lens = out_lens\n        attn_logprob_padded = torch.nn.functional.pad(input=attn_logprob, pad=(1, 0), value=self.blank_logprob)\n\n        total_loss = 0.0\n        for bid in range(attn_logprob.shape[0]):\n            target_seq = torch.arange(1, key_lens[bid] + 1).unsqueeze(0)\n            curr_logprob = attn_logprob_padded[bid].permute(1, 0, 2)[: query_lens[bid], :, : key_lens[bid] + 1]\n\n            curr_logprob = self.log_softmax(curr_logprob[None])[0]\n            loss = self.ctc_loss(\n                curr_logprob,\n                target_seq,\n                input_lengths=query_lens[bid : bid + 1],\n                target_lengths=key_lens[bid : bid + 1],\n            )\n            total_loss = total_loss + loss\n\n        total_loss = total_loss / attn_logprob.shape[0]\n        return total_loss\n\n\n########################\n# MODEL LOSS LAYERS\n########################\n\n\nclass TacotronLoss(torch.nn.Module):\n    \"\"\"Collection of Tacotron set-up based on provided config.\"\"\"\n\n    def __init__(self, c, ga_sigma=0.4):\n        super().__init__()\n        self.stopnet_pos_weight = c.stopnet_pos_weight\n        self.use_capacitron_vae = 
c.use_capacitron_vae\n        if self.use_capacitron_vae:\n            self.capacitron_capacity = c.capacitron_vae.capacitron_capacity\n            self.capacitron_vae_loss_alpha = c.capacitron_vae.capacitron_VAE_loss_alpha\n        self.ga_alpha = c.ga_alpha\n        self.decoder_diff_spec_alpha = c.decoder_diff_spec_alpha\n        self.postnet_diff_spec_alpha = c.postnet_diff_spec_alpha\n        self.decoder_alpha = c.decoder_loss_alpha\n        self.postnet_alpha = c.postnet_loss_alpha\n        self.decoder_ssim_alpha = c.decoder_ssim_alpha\n        self.postnet_ssim_alpha = c.postnet_ssim_alpha\n        self.config = c\n\n        # postnet and decoder loss\n        if c.loss_masking:\n            self.criterion = L1LossMasked(c.seq_len_norm) if c.model in [\"Tacotron\"] else MSELossMasked(c.seq_len_norm)\n        else:\n            self.criterion = nn.L1Loss() if c.model in [\"Tacotron\"] else nn.MSELoss()\n        # guided attention loss\n        if c.ga_alpha > 0:\n            self.criterion_ga = GuidedAttentionLoss(sigma=ga_sigma)\n        # differential spectral loss\n        if c.postnet_diff_spec_alpha > 0 or c.decoder_diff_spec_alpha > 0:\n            self.criterion_diff_spec = DifferentialSpectralLoss(loss_func=self.criterion)\n        # ssim loss\n        if c.postnet_ssim_alpha > 0 or c.decoder_ssim_alpha > 0:\n            self.criterion_ssim = SSIMLoss()\n        # stopnet loss\n        # pylint: disable=not-callable\n        self.criterion_st = BCELossMasked(pos_weight=torch.tensor(self.stopnet_pos_weight)) if c.stopnet else None\n\n        # For dev purposes only\n        self.criterion_capacitron_reconstruction_loss = nn.L1Loss(reduction=\"sum\")\n\n    def forward(\n        self,\n        postnet_output,\n        decoder_output,\n        mel_input,\n        linear_input,\n        stopnet_output,\n        stopnet_target,\n        stop_target_length,\n        capacitron_vae_outputs,\n        output_lens,\n        decoder_b_output,\n        
alignments,\n        alignment_lens,\n        alignments_backwards,\n        input_lens,\n    ):\n        # decoder outputs linear or mel spectrograms for Tacotron and Tacotron2\n        # the target should be set accordingly\n        postnet_target = linear_input if self.config.model.lower() in [\"tacotron\"] else mel_input\n\n        return_dict = {}\n        # remove lengths if no masking is applied\n        if not self.config.loss_masking:\n            output_lens = None\n        # decoder and postnet losses\n        if self.config.loss_masking:\n            if self.decoder_alpha > 0:\n                decoder_loss = self.criterion(decoder_output, mel_input, output_lens)\n            if self.postnet_alpha > 0:\n                postnet_loss = self.criterion(postnet_output, postnet_target, output_lens)\n        else:\n            if self.decoder_alpha > 0:\n                decoder_loss = self.criterion(decoder_output, mel_input)\n            if self.postnet_alpha > 0:\n                postnet_loss = self.criterion(postnet_output, postnet_target)\n        loss = self.decoder_alpha * decoder_loss + self.postnet_alpha * postnet_loss\n        return_dict[\"decoder_loss\"] = decoder_loss\n        return_dict[\"postnet_loss\"] = postnet_loss\n\n        if self.use_capacitron_vae:\n            # extract capacitron vae infos\n            posterior_distribution, prior_distribution, beta = capacitron_vae_outputs\n\n            # KL divergence term between the posterior and the prior\n            kl_term = torch.mean(torch.distributions.kl_divergence(posterior_distribution, prior_distribution))\n\n            # Limit the mutual information between the data and latent space by the variational capacity limit\n            kl_capacity = kl_term - self.capacitron_capacity\n\n            # pass beta through softplus to keep it positive\n            beta = torch.nn.functional.softplus(beta)[0]\n\n            # This is the term going to the main ADAM optimiser, we detach beta 
because\n            # beta is optimised by a separate, SGD optimiser below\n            capacitron_vae_loss = beta.detach() * kl_capacity\n\n            # normalize the capacitron_vae_loss as in L1Loss or MSELoss.\n            # After this, both the standard loss and capacitron_vae_loss will be in the same scale.\n            # For this reason we don't need use L1Loss and MSELoss in \"sum\" reduction mode.\n            # Note: the batch is not considered because the L1Loss was calculated in \"sum\" mode\n            # divided by the batch size, So not dividing the capacitron_vae_loss by B is legitimate.\n\n            # get B T D dimension from input\n            B, T, D = mel_input.size()\n            # normalize\n            if self.config.loss_masking:\n                # if mask loss get T using the mask\n                T = output_lens.sum() / B\n\n            # Only for dev purposes to be able to compare the reconstruction loss with the values in the\n            # original Capacitron paper\n            return_dict[\"capaciton_reconstruction_loss\"] = (\n                self.criterion_capacitron_reconstruction_loss(decoder_output, mel_input) / decoder_output.size(0)\n            ) + kl_capacity\n\n            capacitron_vae_loss = capacitron_vae_loss / (T * D)\n            capacitron_vae_loss = capacitron_vae_loss * self.capacitron_vae_loss_alpha\n\n            # This is the term to purely optimise beta and to pass into the SGD optimizer\n            beta_loss = torch.negative(beta) * kl_capacity.detach()\n\n            loss += capacitron_vae_loss\n\n            return_dict[\"capacitron_vae_loss\"] = capacitron_vae_loss\n            return_dict[\"capacitron_vae_beta_loss\"] = beta_loss\n            return_dict[\"capacitron_vae_kl_term\"] = kl_term\n            return_dict[\"capacitron_beta\"] = beta\n\n        stop_loss = (\n            self.criterion_st(stopnet_output, stopnet_target, stop_target_length)\n            if self.config.stopnet\n            else 
torch.zeros(1)\n        )\n        loss += stop_loss\n        return_dict[\"stopnet_loss\"] = stop_loss\n\n        # backward decoder loss (if enabled)\n        if self.config.bidirectional_decoder:\n            if self.config.loss_masking:\n                decoder_b_loss = self.criterion(torch.flip(decoder_b_output, dims=(1,)), mel_input, output_lens)\n            else:\n                decoder_b_loss = self.criterion(torch.flip(decoder_b_output, dims=(1,)), mel_input)\n            decoder_c_loss = torch.nn.functional.l1_loss(torch.flip(decoder_b_output, dims=(1,)), decoder_output)\n            loss += self.decoder_alpha * (decoder_b_loss + decoder_c_loss)\n            return_dict[\"decoder_b_loss\"] = decoder_b_loss\n            return_dict[\"decoder_c_loss\"] = decoder_c_loss\n\n        # double decoder consistency loss (if enabled)\n        if self.config.double_decoder_consistency:\n            if self.config.loss_masking:\n                decoder_b_loss = self.criterion(decoder_b_output, mel_input, output_lens)\n            else:\n                decoder_b_loss = self.criterion(decoder_b_output, mel_input)\n            # decoder_c_loss = torch.nn.functional.l1_loss(decoder_b_output, decoder_output)\n            attention_c_loss = torch.nn.functional.l1_loss(alignments, alignments_backwards)\n            loss += self.decoder_alpha * (decoder_b_loss + attention_c_loss)\n            return_dict[\"decoder_coarse_loss\"] = decoder_b_loss\n            return_dict[\"decoder_ddc_loss\"] = attention_c_loss\n\n        # guided attention loss (if enabled)\n        if self.config.ga_alpha > 0:\n            ga_loss = self.criterion_ga(alignments, input_lens, alignment_lens)\n            loss += ga_loss * self.ga_alpha\n            return_dict[\"ga_loss\"] = ga_loss\n\n        # decoder differential spectral loss\n        if self.config.decoder_diff_spec_alpha > 0:\n            decoder_diff_spec_loss = self.criterion_diff_spec(decoder_output, mel_input, output_lens)\n      
      loss += decoder_diff_spec_loss * self.decoder_diff_spec_alpha\n            return_dict[\"decoder_diff_spec_loss\"] = decoder_diff_spec_loss\n\n        # postnet differential spectral loss\n        if self.config.postnet_diff_spec_alpha > 0:\n            postnet_diff_spec_loss = self.criterion_diff_spec(postnet_output, postnet_target, output_lens)\n            loss += postnet_diff_spec_loss * self.postnet_diff_spec_alpha\n            return_dict[\"postnet_diff_spec_loss\"] = postnet_diff_spec_loss\n\n        # decoder ssim loss\n        if self.config.decoder_ssim_alpha > 0:\n            decoder_ssim_loss = self.criterion_ssim(decoder_output, mel_input, output_lens)\n            loss += decoder_ssim_loss * self.decoder_ssim_alpha\n            return_dict[\"decoder_ssim_loss\"] = decoder_ssim_loss\n\n        # postnet ssim loss\n        if self.config.postnet_ssim_alpha > 0:\n            postnet_ssim_loss = self.criterion_ssim(postnet_output, postnet_target, output_lens)\n            loss += postnet_ssim_loss * self.postnet_ssim_alpha\n            return_dict[\"postnet_ssim_loss\"] = postnet_ssim_loss\n\n        return_dict[\"loss\"] = loss\n        return return_dict\n\n\nclass GlowTTSLoss(torch.nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.constant_factor = 0.5 * math.log(2 * math.pi)\n\n    def forward(self, z, means, scales, log_det, y_lengths, o_dur_log, o_attn_dur, x_lengths):\n        return_dict = {}\n        # flow loss - neg log likelihood\n        pz = torch.sum(scales) + 0.5 * torch.sum(torch.exp(-2 * scales) * (z - means) ** 2)\n        log_mle = self.constant_factor + (pz - torch.sum(log_det)) / (torch.sum(y_lengths) * z.shape[2])\n        # duration loss - MSE\n        loss_dur = torch.sum((o_dur_log - o_attn_dur) ** 2) / torch.sum(x_lengths)\n        # duration loss - huber loss\n        # loss_dur = torch.nn.functional.smooth_l1_loss(o_dur_log, o_attn_dur, reduction=\"sum\") / torch.sum(x_lengths)\n        
return_dict[\"loss\"] = log_mle + loss_dur\n        return_dict[\"log_mle\"] = log_mle\n        return_dict[\"loss_dur\"] = loss_dur\n\n        # check if any loss is NaN\n        for key, loss in return_dict.items():\n            if torch.isnan(loss):\n                raise RuntimeError(f\" [!] NaN loss with {key}.\")\n        return return_dict\n\n\ndef mse_loss_custom(x, y):\n    \"\"\"MSE loss using the torch back-end without reduction.\n    It uses less VRAM than the raw code\"\"\"\n    expanded_x, expanded_y = torch.broadcast_tensors(x, y)\n    return torch._C._nn.mse_loss(expanded_x, expanded_y, 0)  # pylint: disable=protected-access, c-extension-no-member\n\n\nclass MDNLoss(nn.Module):\n    \"\"\"Mixture of Density Network Loss as described in https://arxiv.org/pdf/2003.01950.pdf.\"\"\"\n\n    def forward(self, logp, text_lengths, mel_lengths):  # pylint: disable=no-self-use\n        \"\"\"\n        Shapes:\n            mu: [B, D, T]\n            log_sigma: [B, D, T]\n            mel_spec: [B, D, T]\n        \"\"\"\n        B, T_seq, T_mel = logp.shape\n        log_alpha = logp.new_ones(B, T_seq, T_mel) * (-1e4)\n        log_alpha[:, 0, 0] = logp[:, 0, 0]\n        for t in range(1, T_mel):\n            prev_step = torch.cat(\n                [log_alpha[:, :, t - 1 : t], functional.pad(log_alpha[:, :, t - 1 : t], (0, 0, 1, -1), value=-1e4)],\n                dim=-1,\n            )\n            log_alpha[:, :, t] = torch.logsumexp(prev_step + 1e-4, dim=-1) + logp[:, :, t]\n        alpha_last = log_alpha[torch.arange(B), text_lengths - 1, mel_lengths - 1]\n        mdn_loss = -alpha_last.mean() / T_seq\n        return mdn_loss  # , log_prob_matrix\n\n\nclass AlignTTSLoss(nn.Module):\n    \"\"\"Modified AlignTTS Loss.\n    Computes\n        - L1 and SSIM losses from output spectrograms.\n        - Huber loss for duration predictor.\n        - MDNLoss for Mixture of Density Network.\n\n    All loss values are aggregated by a weighted sum of the alpha values.\n\n  
  Args:\n        c (dict): TTS model configuration.\n    \"\"\"\n\n    def __init__(self, c):\n        super().__init__()\n        self.mdn_loss = MDNLoss()\n        self.spec_loss = MSELossMasked(False)\n        self.ssim = SSIMLoss()\n        self.dur_loss = MSELossMasked(False)\n\n        self.ssim_alpha = c.ssim_alpha\n        self.dur_loss_alpha = c.dur_loss_alpha\n        self.spec_loss_alpha = c.spec_loss_alpha\n        self.mdn_alpha = c.mdn_alpha\n\n    def forward(\n        self, logp, decoder_output, decoder_target, decoder_output_lens, dur_output, dur_target, input_lens, phase\n    ):\n        # ssim_alpha, dur_loss_alpha, spec_loss_alpha, mdn_alpha = self.set_alphas(step)\n        spec_loss, ssim_loss, dur_loss, mdn_loss = 0, 0, 0, 0\n        if phase == 0:\n            mdn_loss = self.mdn_loss(logp, input_lens, decoder_output_lens)\n        elif phase == 1:\n            spec_loss = self.spec_loss(decoder_output, decoder_target, decoder_output_lens)\n            ssim_loss = self.ssim(decoder_output, decoder_target, decoder_output_lens)\n        elif phase == 2:\n            mdn_loss = self.mdn_loss(logp, input_lens, decoder_output_lens)\n            spec_loss = self.spec_loss(decoder_output, decoder_target, decoder_output_lens)\n            ssim_loss = self.ssim(decoder_output, decoder_target, decoder_output_lens)\n        elif phase == 3:\n            dur_loss = self.dur_loss(dur_output.unsqueeze(2), dur_target.unsqueeze(2), input_lens)\n        else:\n            mdn_loss = self.mdn_loss(logp, input_lens, decoder_output_lens)\n            spec_loss = self.spec_loss(decoder_output, decoder_target, decoder_output_lens)\n            ssim_loss = self.ssim(decoder_output, decoder_target, decoder_output_lens)\n            dur_loss = self.dur_loss(dur_output.unsqueeze(2), dur_target.unsqueeze(2), input_lens)\n        loss = (\n            self.spec_loss_alpha * spec_loss\n            + self.ssim_alpha * ssim_loss\n            + self.dur_loss_alpha * 
dur_loss\n            + self.mdn_alpha * mdn_loss\n        )\n        return {\"loss\": loss, \"loss_l1\": spec_loss, \"loss_ssim\": ssim_loss, \"loss_dur\": dur_loss, \"mdn_loss\": mdn_loss}\n\n\nclass VitsGeneratorLoss(nn.Module):\n    def __init__(self, c: Coqpit):\n        super().__init__()\n        self.kl_loss_alpha = c.kl_loss_alpha\n        self.gen_loss_alpha = c.gen_loss_alpha\n        self.feat_loss_alpha = c.feat_loss_alpha\n        self.dur_loss_alpha = c.dur_loss_alpha\n        self.mel_loss_alpha = c.mel_loss_alpha\n        self.spk_encoder_loss_alpha = c.speaker_encoder_loss_alpha\n        self.stft = TorchSTFT(\n            c.audio.fft_size,\n            c.audio.hop_length,\n            c.audio.win_length,\n            sample_rate=c.audio.sample_rate,\n            mel_fmin=c.audio.mel_fmin,\n            mel_fmax=c.audio.mel_fmax,\n            n_mels=c.audio.num_mels,\n            use_mel=True,\n            do_amp_to_db=True,\n        )\n\n    @staticmethod\n    def feature_loss(feats_real, feats_generated):\n        loss = 0\n        for dr, dg in zip(feats_real, feats_generated):\n            for rl, gl in zip(dr, dg):\n                rl = rl.float().detach()\n                gl = gl.float()\n                loss += torch.mean(torch.abs(rl - gl))\n        return loss * 2\n\n    @staticmethod\n    def generator_loss(scores_fake):\n        loss = 0\n        gen_losses = []\n        for dg in scores_fake:\n            dg = dg.float()\n            l = torch.mean((1 - dg) ** 2)\n            gen_losses.append(l)\n            loss += l\n\n        return loss, gen_losses\n\n    @staticmethod\n    def kl_loss(z_p, logs_q, m_p, logs_p, z_mask):\n        \"\"\"\n        z_p, logs_q: [b, h, t_t]\n        m_p, logs_p: [b, h, t_t]\n        \"\"\"\n        z_p = z_p.float()\n        logs_q = logs_q.float()\n        m_p = m_p.float()\n        logs_p = logs_p.float()\n        z_mask = z_mask.float()\n\n        kl = logs_p - logs_q - 0.5\n        kl += 0.5 * 
((z_p - m_p) ** 2) * torch.exp(-2.0 * logs_p)\n        kl = torch.sum(kl * z_mask)\n        l = kl / torch.sum(z_mask)\n        return l\n\n    @staticmethod\n    def cosine_similarity_loss(gt_spk_emb, syn_spk_emb):\n        return -torch.nn.functional.cosine_similarity(gt_spk_emb, syn_spk_emb).mean()\n\n    def forward(\n        self,\n        mel_slice,\n        mel_slice_hat,\n        z_p,\n        logs_q,\n        m_p,\n        logs_p,\n        z_len,\n        scores_disc_fake,\n        feats_disc_fake,\n        feats_disc_real,\n        loss_duration,\n        use_speaker_encoder_as_loss=False,\n        gt_spk_emb=None,\n        syn_spk_emb=None,\n    ):\n        \"\"\"\n        Shapes:\n            - mel_slice : :math:`[B, 1, T]`\n            - mel_slice_hat: :math:`[B, 1, T]`\n            - z_p: :math:`[B, C, T]`\n            - logs_q: :math:`[B, C, T]`\n            - m_p: :math:`[B, C, T]`\n            - logs_p: :math:`[B, C, T]`\n            - z_len: :math:`[B]`\n            - scores_disc_fake[i]: :math:`[B, C]`\n            - feats_disc_fake[i][j]: :math:`[B, C, T', P]`\n            - feats_disc_real[i][j]: :math:`[B, C, T', P]`\n        \"\"\"\n        loss = 0.0\n        return_dict = {}\n        z_mask = sequence_mask(z_len).float()\n        # compute losses\n        loss_kl = (\n            self.kl_loss(z_p=z_p, logs_q=logs_q, m_p=m_p, logs_p=logs_p, z_mask=z_mask.unsqueeze(1))\n            * self.kl_loss_alpha\n        )\n        loss_feat = (\n            self.feature_loss(feats_real=feats_disc_real, feats_generated=feats_disc_fake) * self.feat_loss_alpha\n        )\n        loss_gen = self.generator_loss(scores_fake=scores_disc_fake)[0] * self.gen_loss_alpha\n        loss_mel = torch.nn.functional.l1_loss(mel_slice, mel_slice_hat) * self.mel_loss_alpha\n        loss_duration = torch.sum(loss_duration.float()) * self.dur_loss_alpha\n        loss = loss_kl + loss_feat + loss_mel + loss_gen + loss_duration\n\n        if use_speaker_encoder_as_loss:\n  
          loss_se = self.cosine_similarity_loss(gt_spk_emb, syn_spk_emb) * self.spk_encoder_loss_alpha\n            loss = loss + loss_se\n            return_dict[\"loss_spk_encoder\"] = loss_se\n        # pass losses to the dict\n        return_dict[\"loss_gen\"] = loss_gen\n        return_dict[\"loss_kl\"] = loss_kl\n        return_dict[\"loss_feat\"] = loss_feat\n        return_dict[\"loss_mel\"] = loss_mel\n        return_dict[\"loss_duration\"] = loss_duration\n        return_dict[\"loss\"] = loss\n        return return_dict\n\n\nclass VitsDiscriminatorLoss(nn.Module):\n    def __init__(self, c: Coqpit):\n        super().__init__()\n        self.disc_loss_alpha = c.disc_loss_alpha\n\n    @staticmethod\n    def discriminator_loss(scores_real, scores_fake):\n        loss = 0\n        real_losses = []\n        fake_losses = []\n        for dr, dg in zip(scores_real, scores_fake):\n            dr = dr.float()\n            dg = dg.float()\n            real_loss = torch.mean((1 - dr) ** 2)\n            fake_loss = torch.mean(dg**2)\n            loss += real_loss + fake_loss\n            real_losses.append(real_loss.item())\n            fake_losses.append(fake_loss.item())\n        return loss, real_losses, fake_losses\n\n    def forward(self, scores_disc_real, scores_disc_fake):\n        loss = 0.0\n        return_dict = {}\n        loss_disc, loss_disc_real, _ = self.discriminator_loss(\n            scores_real=scores_disc_real, scores_fake=scores_disc_fake\n        )\n        return_dict[\"loss_disc\"] = loss_disc * self.disc_loss_alpha\n        loss = loss + return_dict[\"loss_disc\"]\n        return_dict[\"loss\"] = loss\n\n        for i, ldr in enumerate(loss_disc_real):\n            return_dict[f\"loss_disc_real_{i}\"] = ldr\n        return return_dict\n\n\nclass ForwardTTSLoss(nn.Module):\n    \"\"\"Generic configurable ForwardTTS loss.\"\"\"\n\n    def __init__(self, c):\n        super().__init__()\n        if c.spec_loss_type == \"mse\":\n            
self.spec_loss = MSELossMasked(False)\n        elif c.spec_loss_type == \"l1\":\n            self.spec_loss = L1LossMasked(False)\n        else:\n            raise ValueError(\" [!] Unknown spec_loss_type {}\".format(c.spec_loss_type))\n\n        if c.duration_loss_type == \"mse\":\n            self.dur_loss = MSELossMasked(False)\n        elif c.duration_loss_type == \"l1\":\n            self.dur_loss = L1LossMasked(False)\n        elif c.duration_loss_type == \"huber\":\n            self.dur_loss = Huber()\n        else:\n            raise ValueError(\" [!] Unknown duration_loss_type {}\".format(c.duration_loss_type))\n\n        if c.model_args.use_aligner:\n            self.aligner_loss = ForwardSumLoss()\n            self.aligner_loss_alpha = c.aligner_loss_alpha\n\n        if c.model_args.use_pitch:\n            self.pitch_loss = MSELossMasked(False)\n            self.pitch_loss_alpha = c.pitch_loss_alpha\n\n        if c.model_args.use_energy:\n            self.energy_loss = MSELossMasked(False)\n            self.energy_loss_alpha = c.energy_loss_alpha\n\n        if c.use_ssim_loss:\n            self.ssim = SSIMLoss() if c.use_ssim_loss else None\n            self.ssim_loss_alpha = c.ssim_loss_alpha\n\n        self.spec_loss_alpha = c.spec_loss_alpha\n        self.dur_loss_alpha = c.dur_loss_alpha\n        self.binary_alignment_loss_alpha = c.binary_align_loss_alpha\n\n    @staticmethod\n    def _binary_alignment_loss(alignment_hard, alignment_soft):\n        \"\"\"Binary loss that forces soft alignments to match the hard alignments as\n        explained in `https://arxiv.org/pdf/2108.10447.pdf`.\n        \"\"\"\n        log_sum = torch.log(torch.clamp(alignment_soft[alignment_hard == 1], min=1e-12)).sum()\n        return -log_sum / alignment_hard.sum()\n\n    def forward(\n        self,\n        decoder_output,\n        decoder_target,\n        decoder_output_lens,\n        dur_output,\n        dur_target,\n        pitch_output,\n        pitch_target,\n       
 energy_output,\n        energy_target,\n        input_lens,\n        alignment_logprob=None,\n        alignment_hard=None,\n        alignment_soft=None,\n        binary_loss_weight=None,\n    ):\n        loss = 0\n        return_dict = {}\n        if hasattr(self, \"ssim\") and self.ssim_loss_alpha > 0:\n            ssim_loss = self.ssim(decoder_output, decoder_target, decoder_output_lens)\n            loss = loss + self.ssim_loss_alpha * ssim_loss\n            return_dict[\"loss_ssim\"] = self.ssim_loss_alpha * ssim_loss\n\n        if self.spec_loss_alpha > 0:\n            spec_loss = self.spec_loss(decoder_output, decoder_target, decoder_output_lens)\n            loss = loss + self.spec_loss_alpha * spec_loss\n            return_dict[\"loss_spec\"] = self.spec_loss_alpha * spec_loss\n\n        if self.dur_loss_alpha > 0:\n            log_dur_tgt = torch.log(dur_target.float() + 1)\n            dur_loss = self.dur_loss(dur_output[:, :, None], log_dur_tgt[:, :, None], input_lens)\n            loss = loss + self.dur_loss_alpha * dur_loss\n            return_dict[\"loss_dur\"] = self.dur_loss_alpha * dur_loss\n\n        if hasattr(self, \"pitch_loss\") and self.pitch_loss_alpha > 0:\n            pitch_loss = self.pitch_loss(pitch_output.transpose(1, 2), pitch_target.transpose(1, 2), input_lens)\n            loss = loss + self.pitch_loss_alpha * pitch_loss\n            return_dict[\"loss_pitch\"] = self.pitch_loss_alpha * pitch_loss\n\n        if hasattr(self, \"energy_loss\") and self.energy_loss_alpha > 0:\n            energy_loss = self.energy_loss(energy_output.transpose(1, 2), energy_target.transpose(1, 2), input_lens)\n            loss = loss + self.energy_loss_alpha * energy_loss\n            return_dict[\"loss_energy\"] = self.energy_loss_alpha * energy_loss\n\n        if hasattr(self, \"aligner_loss\") and self.aligner_loss_alpha > 0:\n            aligner_loss = self.aligner_loss(alignment_logprob, input_lens, decoder_output_lens)\n            loss = 
loss + self.aligner_loss_alpha * aligner_loss\n            return_dict[\"loss_aligner\"] = self.aligner_loss_alpha * aligner_loss\n\n        if self.binary_alignment_loss_alpha > 0 and alignment_hard is not None:\n            binary_alignment_loss = self._binary_alignment_loss(alignment_hard, alignment_soft)\n            loss = loss + self.binary_alignment_loss_alpha * binary_alignment_loss\n            if binary_loss_weight:\n                return_dict[\"loss_binary_alignment\"] = (\n                    self.binary_alignment_loss_alpha * binary_alignment_loss * binary_loss_weight\n                )\n            else:\n                return_dict[\"loss_binary_alignment\"] = self.binary_alignment_loss_alpha * binary_alignment_loss\n\n        return_dict[\"loss\"] = loss\n        return return_dict\n"
  },
  {
    "path": "TTS/tts/layers/overflow/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/layers/overflow/common_layers.py",
    "content": "from typing import List, Tuple\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\nfrom tqdm.auto import tqdm\n\nfrom TTS.tts.layers.tacotron.common_layers import Linear\nfrom TTS.tts.layers.tacotron.tacotron2 import ConvBNBlock\n\n\nclass Encoder(nn.Module):\n    r\"\"\"Neural HMM Encoder\n\n    Same as Tacotron 2 encoder but increases the input length by states per phone\n\n    Args:\n        num_chars (int): Number of characters in the input.\n        state_per_phone (int): Number of states per phone.\n        in_out_channels (int): number of input and output channels.\n        n_convolutions (int): number of convolutional layers.\n    \"\"\"\n\n    def __init__(self, num_chars, state_per_phone, in_out_channels=512, n_convolutions=3):\n        super().__init__()\n\n        self.state_per_phone = state_per_phone\n        self.in_out_channels = in_out_channels\n\n        self.emb = nn.Embedding(num_chars, in_out_channels)\n        self.convolutions = nn.ModuleList()\n        for _ in range(n_convolutions):\n            self.convolutions.append(ConvBNBlock(in_out_channels, in_out_channels, 5, \"relu\"))\n        self.lstm = nn.LSTM(\n            in_out_channels,\n            int(in_out_channels / 2) * state_per_phone,\n            num_layers=1,\n            batch_first=True,\n            bias=True,\n            bidirectional=True,\n        )\n        self.rnn_state = None\n\n    def forward(self, x: torch.FloatTensor, x_len: torch.LongTensor) -> Tuple[torch.FloatTensor, torch.LongTensor]:\n        \"\"\"Forward pass to the encoder.\n\n        Args:\n            x (torch.FloatTensor): input text indices.\n                - shape: :math:`(b, T_{in})`\n            x_len (torch.LongTensor): input text lengths.\n                - shape: :math:`(b,)`\n\n        Returns:\n            Tuple[torch.FloatTensor, torch.LongTensor]: encoder outputs and output lengths.\n                -shape: :math:`((b, T_{in} * states_per_phone, 
in_out_channels), (b,))`\n        \"\"\"\n        b, T = x.shape\n        o = self.emb(x).transpose(1, 2)\n        for layer in self.convolutions:\n            o = layer(o)\n        o = o.transpose(1, 2)\n        o = nn.utils.rnn.pack_padded_sequence(o, x_len.cpu(), batch_first=True)\n        self.lstm.flatten_parameters()\n        o, _ = self.lstm(o)\n        o, _ = nn.utils.rnn.pad_packed_sequence(o, batch_first=True)\n        o = o.reshape(b, T * self.state_per_phone, self.in_out_channels)\n        x_len = x_len * self.state_per_phone\n        return o, x_len\n\n    def inference(self, x, x_len):\n        \"\"\"Inference to the encoder.\n\n        Args:\n            x (torch.FloatTensor): input text indices.\n                - shape: :math:`(b, T_{in})`\n            x_len (torch.LongTensor): input text lengths.\n                - shape: :math:`(b,)`\n\n        Returns:\n            Tuple[torch.FloatTensor, torch.LongTensor]: encoder outputs and output lengths.\n                -shape: :math:`((b, T_{in} * states_per_phone, in_out_channels), (b,))`\n        \"\"\"\n        b, T = x.shape\n        o = self.emb(x).transpose(1, 2)\n        for layer in self.convolutions:\n            o = layer(o)\n        o = o.transpose(1, 2)\n        # self.lstm.flatten_parameters()\n        o, _ = self.lstm(o)\n        o = o.reshape(b, T * self.state_per_phone, self.in_out_channels)\n        x_len = x_len * self.state_per_phone\n        return o, x_len\n\n\nclass ParameterModel(nn.Module):\n    r\"\"\"Main neural network of the outputnet\n\n    Note: Do not put dropout layers here, the model will not converge.\n\n    Args:\n            outputnet_size (List[int]): the architecture of the parameter model\n            input_size (int): size of input for the first layer\n            output_size (int): size of output i.e size of the feature dim\n            frame_channels (int): feature dim to set the flat start bias\n            flat_start_params (dict): flat start parameters to set 
the bias\n    \"\"\"\n\n    def __init__(\n        self,\n        outputnet_size: List[int],\n        input_size: int,\n        output_size: int,\n        frame_channels: int,\n        flat_start_params: dict,\n    ):\n        super().__init__()\n        self.frame_channels = frame_channels\n\n        self.layers = nn.ModuleList(\n            [Linear(inp, out) for inp, out in zip([input_size] + outputnet_size[:-1], outputnet_size)]\n        )\n        self.last_layer = nn.Linear(outputnet_size[-1], output_size)\n        self.flat_start_output_layer(\n            flat_start_params[\"mean\"], flat_start_params[\"std\"], flat_start_params[\"transition_p\"]\n        )\n\n    def flat_start_output_layer(self, mean, std, transition_p):\n        self.last_layer.weight.data.zero_()\n        self.last_layer.bias.data[0 : self.frame_channels] = mean\n        self.last_layer.bias.data[self.frame_channels : 2 * self.frame_channels] = OverflowUtils.inverse_softplus(std)\n        self.last_layer.bias.data[2 * self.frame_channels :] = OverflowUtils.inverse_sigmod(transition_p)\n\n    def forward(self, x):\n        for layer in self.layers:\n            x = F.relu(layer(x))\n        x = self.last_layer(x)\n        return x\n\n\nclass Outputnet(nn.Module):\n    r\"\"\"\n    This network takes current state and previous observed values as input\n    and returns its parameters, mean, standard deviation and probability\n    of transition to the next state\n    \"\"\"\n\n    def __init__(\n        self,\n        encoder_dim: int,\n        memory_rnn_dim: int,\n        frame_channels: int,\n        outputnet_size: List[int],\n        flat_start_params: dict,\n        std_floor: float = 1e-2,\n    ):\n        super().__init__()\n\n        self.frame_channels = frame_channels\n        self.flat_start_params = flat_start_params\n        self.std_floor = std_floor\n\n        input_size = memory_rnn_dim + encoder_dim\n        output_size = 2 * frame_channels + 1\n\n        
self.parametermodel = ParameterModel(\n            outputnet_size=outputnet_size,\n            input_size=input_size,\n            output_size=output_size,\n            flat_start_params=flat_start_params,\n            frame_channels=frame_channels,\n        )\n\n    def forward(self, ar_mels, inputs):\n        r\"\"\"Inputs observation and returns the means, stds and transition probability for the current state\n\n        Args:\n            ar_mel_inputs (torch.FloatTensor): shape (batch, prenet_dim)\n            states (torch.FloatTensor):  (batch, hidden_states, hidden_state_dim)\n\n        Returns:\n            means: means for the emission observation for each feature\n                - shape: (B, hidden_states, feature_size)\n            stds: standard deviations for the emission observation for each feature\n                - shape: (batch, hidden_states, feature_size)\n            transition_vectors: transition vector for the current hidden state\n                - shape: (batch, hidden_states)\n        \"\"\"\n        batch_size, prenet_dim = ar_mels.shape[0], ar_mels.shape[1]\n        N = inputs.shape[1]\n\n        ar_mels = ar_mels.unsqueeze(1).expand(batch_size, N, prenet_dim)\n        ar_mels = torch.cat((ar_mels, inputs), dim=2)\n        ar_mels = self.parametermodel(ar_mels)\n\n        mean, std, transition_vector = (\n            ar_mels[:, :, 0 : self.frame_channels],\n            ar_mels[:, :, self.frame_channels : 2 * self.frame_channels],\n            ar_mels[:, :, 2 * self.frame_channels :].squeeze(2),\n        )\n        std = F.softplus(std)\n        std = self._floor_std(std)\n        return mean, std, transition_vector\n\n    def _floor_std(self, std):\n        r\"\"\"\n        It clamps the standard deviation to not to go below some level\n        This removes the problem when the model tries to cheat for higher likelihoods by converting\n        one of the gaussians to a point mass.\n\n        Args:\n            std (float Tensor): tensor 
containing the standard deviation to be floored\n        \"\"\"\n        original_tensor = std.clone().detach()\n        std = torch.clamp(std, min=self.std_floor)\n        if torch.any(original_tensor != std):\n            print(\n                \"[*] Standard deviation was floored! The model is preventing overfitting, nothing serious to worry about\"\n            )\n        return std\n\n\nclass OverflowUtils:\n    @staticmethod\n    def get_data_parameters_for_flat_start(\n        data_loader: torch.utils.data.DataLoader, out_channels: int, states_per_phone: int\n    ):\n        \"\"\"Generates data parameters for flat starting the HMM.\n\n        Args:\n            data_loader (torch.utils.data.DataLoader): loader yielding batches with `token_id_lengths`, `mel` and `mel_lengths`\n            out_channels (int): mel spectrogram channels\n            states_per_phone (int): HMM states per phone\n        \"\"\"\n\n        # State related information for transition_p\n        total_state_len = 0\n        total_mel_len = 0\n\n        # Useful for data mean and std\n        total_mel_sum = 0\n        total_mel_sq_sum = 0\n\n        for batch in tqdm(data_loader, leave=False):\n            text_lengths = batch[\"token_id_lengths\"]\n            mels = batch[\"mel\"]\n            mel_lengths = batch[\"mel_lengths\"]\n\n            total_state_len += torch.sum(text_lengths)\n            total_mel_len += torch.sum(mel_lengths)\n            total_mel_sum += torch.sum(mels)\n            total_mel_sq_sum += torch.sum(torch.pow(mels, 2))\n\n        data_mean = total_mel_sum / (total_mel_len * out_channels)\n        data_std = torch.sqrt((total_mel_sq_sum / (total_mel_len * out_channels)) - torch.pow(data_mean, 2))\n        average_num_states = total_state_len / len(data_loader.dataset)\n        average_mel_len = total_mel_len / len(data_loader.dataset)\n        average_duration_each_state = average_mel_len / average_num_states\n        init_transition_prob = 1 / average_duration_each_state\n\n        return data_mean, data_std, 
(init_transition_prob * states_per_phone)\n\n    @staticmethod\n    @torch.no_grad()\n    def update_flat_start_transition(model, transition_p):\n        model.neural_hmm.output_net.parametermodel.flat_start_output_layer(0.0, 1.0, transition_p)\n\n    @staticmethod\n    def log_clamped(x, eps=1e-04):\n        \"\"\"\n        Avoids the log(0) problem\n\n        Args:\n            x (torch.tensor): input tensor\n            eps (float, optional): lower bound. Defaults to 1e-04.\n\n        Returns:\n            torch.tensor: :math:`log(x)`\n        \"\"\"\n        clamped_x = torch.clamp(x, min=eps)\n        return torch.log(clamped_x)\n\n    @staticmethod\n    def inverse_sigmod(x):\n        r\"\"\"\n        Inverse of the sigmoid function\n        \"\"\"\n        if not torch.is_tensor(x):\n            x = torch.tensor(x)\n        return OverflowUtils.log_clamped(x / (1.0 - x))\n\n    @staticmethod\n    def inverse_softplus(x):\n        r\"\"\"\n        Inverse of the softplus function\n        \"\"\"\n        if not torch.is_tensor(x):\n            x = torch.tensor(x)\n        return OverflowUtils.log_clamped(torch.exp(x) - 1.0)\n\n    @staticmethod\n    def logsumexp(x, dim):\n        r\"\"\"\n        Differentiable LogSumExp: does not create NaN gradients;\n            when all the inputs are -inf, it yields 0 gradients.\n        Args:\n            x : torch.Tensor -  The input tensor\n            dim: int - The dimension on which the log sum exp has to be applied\n        \"\"\"\n\n        m, _ = x.max(dim=dim)\n        mask = m == -float(\"inf\")\n        s = (x - m.masked_fill_(mask, 0).unsqueeze(dim=dim)).exp().sum(dim=dim)\n        return s.masked_fill_(mask, 1).log() + m.masked_fill_(mask, -float(\"inf\"))\n\n    @staticmethod\n    def double_pad(list_of_different_shape_tensors):\n        r\"\"\"\n        Pads the list of tensors in 2 dimensions\n        \"\"\"\n        second_dim_lens = [len(a) for a in [i[0] for i in list_of_different_shape_tensors]]\n       
 second_dim_max = max(second_dim_lens)\n        padded_x = [F.pad(x, (0, second_dim_max - len(x[0]))) for x in list_of_different_shape_tensors]\n        return nn.utils.rnn.pad_sequence(padded_x, batch_first=True)\n"
  },
  {
    "path": "TTS/tts/layers/overflow/decoder.py",
    "content": "import torch\nfrom torch import nn\n\nfrom TTS.tts.layers.glow_tts.decoder import Decoder as GlowDecoder\nfrom TTS.tts.utils.helpers import sequence_mask\n\n\nclass Decoder(nn.Module):\n    \"\"\"Uses glow decoder with some modifications.\n    ::\n\n        Squeeze -> ActNorm -> InvertibleConv1x1 -> AffineCoupling -> Unsqueeze\n\n    Args:\n        in_channels (int): channels of input tensor.\n        hidden_channels (int): hidden decoder channels.\n        kernel_size (int): Coupling block kernel size. (Wavenet filter kernel size.)\n        dilation_rate (int): rate to increase dilation by each layer in a decoder block.\n        num_flow_blocks (int): number of decoder blocks.\n        num_coupling_layers (int): number coupling layers. (number of wavenet layers.)\n        dropout_p (float): wavenet dropout rate.\n        sigmoid_scale (bool): enable/disable sigmoid scaling in coupling layer.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels,\n        hidden_channels,\n        kernel_size,\n        dilation_rate,\n        num_flow_blocks,\n        num_coupling_layers,\n        dropout_p=0.0,\n        num_splits=4,\n        num_squeeze=2,\n        sigmoid_scale=False,\n        c_in_channels=0,\n    ):\n        super().__init__()\n\n        self.glow_decoder = GlowDecoder(\n            in_channels,\n            hidden_channels,\n            kernel_size,\n            dilation_rate,\n            num_flow_blocks,\n            num_coupling_layers,\n            dropout_p,\n            num_splits,\n            num_squeeze,\n            sigmoid_scale,\n            c_in_channels,\n        )\n        self.n_sqz = num_squeeze\n\n    def forward(self, x, x_len, g=None, reverse=False):\n        \"\"\"\n        Input shapes:\n            - x:  :math:`[B, C, T]`\n            - x_len :math:`[B]`\n            - g: :math:`[B, C]`\n\n        Output shapes:\n            - x:  :math:`[B, C, T]`\n            - x_len :math:`[B]`\n            - logget_tot 
:math:`[B]`\n        \"\"\"\n        x, x_len, x_max_len = self.preprocess(x, x_len, x_len.max())\n        x_mask = torch.unsqueeze(sequence_mask(x_len, x_max_len), 1).to(x.dtype)\n        x, logdet_tot = self.glow_decoder(x, x_mask, g, reverse)\n        return x, x_len, logdet_tot\n\n    def preprocess(self, y, y_lengths, y_max_length):\n        if y_max_length is not None:\n            y_max_length = torch.div(y_max_length, self.n_sqz, rounding_mode=\"floor\") * self.n_sqz\n            y = y[:, :, :y_max_length]\n        y_lengths = torch.div(y_lengths, self.n_sqz, rounding_mode=\"floor\") * self.n_sqz\n        return y, y_lengths, y_max_length\n\n    def store_inverse(self):\n        self.glow_decoder.store_inverse()\n"
  },
  {
    "path": "TTS/tts/layers/overflow/neural_hmm.py",
    "content": "from typing import List\n\nimport torch\nimport torch.distributions as tdist\nimport torch.nn.functional as F\nfrom torch import nn\nfrom torch.utils.checkpoint import checkpoint\n\nfrom TTS.tts.layers.overflow.common_layers import Outputnet, OverflowUtils\nfrom TTS.tts.layers.tacotron.common_layers import Prenet\nfrom TTS.tts.utils.helpers import sequence_mask\n\n\nclass NeuralHMM(nn.Module):\n    \"\"\"Autoregressive left to right HMM model primarily used in \"Neural HMMs are all you need (for high-quality attention-free TTS)\"\n\n    Paper::\n        https://arxiv.org/abs/2108.13320\n\n    Paper abstract::\n        Neural sequence-to-sequence TTS has achieved significantly better output quality than statistical speech synthesis using\n        HMMs. However, neural TTS is generally not probabilistic and uses non-monotonic attention. Attention failures increase\n        training time and can make synthesis babble incoherently. This paper describes how the old and new paradigms can be\n        combined to obtain the advantages of both worlds, by replacing attention in neural TTS with an autoregressive left-right\n        no-skip hidden Markov model defined by a neural network. Based on this proposal, we modify Tacotron 2 to obtain an\n        HMM-based neural TTS model with monotonic alignment, trained to maximise the full sequence likelihood without\n        approximation. We also describe how to combine ideas from classical and contemporary TTS for best results. The resulting\n        example system is smaller and simpler than Tacotron 2, and learns to speak with fewer iterations and less data, whilst\n        achieving comparable naturalness prior to the post-net. Our approach also allows easy control over speaking rate.\n\n    Args:\n        frame_channels (int): Output dimension to generate.\n        ar_order (int): Autoregressive order of the model. 
In ablations of Neural HMM it was found that more autoregression, while giving more variation, hurts the naturalness of the synthesised audio.\n        deterministic_transition (bool): deterministic duration generation based on duration quantiles as defined in \"S. Ronanki, O. Watts, S. King, and G. E. Henter, “Median-based generation of synthetic speech durations using a nonparametric approach,” in Proc. SLT, 2016.\". Defaults to True.\n        encoder_dim (int): Channels of encoder input and character embedding tensors. Defaults to 512.\n        prenet_type (str): `original` or `bn`. `original` sets the default Prenet and `bn` uses Batch Normalization version of the Prenet.\n        prenet_dim (int): Dimension of the Prenet.\n        prenet_n_layers (int): Number of layers in the Prenet.\n        prenet_dropout (float): Dropout probability of the Prenet.\n        prenet_dropout_at_inference (bool): If True, dropout is applied at inference time.\n        memory_rnn_dim (int): Size of the memory RNN to process output of prenet.\n        outputnet_size (List[int]): Size of the output network inside the neural HMM.\n        flat_start_params (dict): Parameters for the flat start initialization of the neural HMM.\n        std_floor (float): Floor value for the standard deviation of the neural HMM. Prevents model cheating by putting point mass and getting infinite likelihood at any datapoint.\n        use_grad_checkpointing (bool, optional): Use gradient checkpointing to save memory. 
Defaults to True.\n    \"\"\"\n\n    def __init__(\n        self,\n        frame_channels: int,\n        ar_order: int,\n        deterministic_transition: bool,\n        encoder_dim: int,\n        prenet_type: str,\n        prenet_dim: int,\n        prenet_n_layers: int,\n        prenet_dropout: float,\n        prenet_dropout_at_inference: bool,\n        memory_rnn_dim: int,\n        outputnet_size: List[int],\n        flat_start_params: dict,\n        std_floor: float,\n        use_grad_checkpointing: bool = True,\n    ):\n        super().__init__()\n\n        self.frame_channels = frame_channels\n        self.ar_order = ar_order\n        self.deterministic_transition = deterministic_transition\n        self.prenet_dim = prenet_dim\n        self.memory_rnn_dim = memory_rnn_dim\n        self.use_grad_checkpointing = use_grad_checkpointing\n\n        self.transition_model = TransitionModel()\n        self.emission_model = EmissionModel()\n\n        assert ar_order > 0, f\"AR order must be greater than 0 provided {ar_order}\"\n\n        self.ar_order = ar_order\n        self.prenet = Prenet(\n            in_features=frame_channels * ar_order,\n            prenet_type=prenet_type,\n            prenet_dropout=prenet_dropout,\n            dropout_at_inference=prenet_dropout_at_inference,\n            out_features=[self.prenet_dim for _ in range(prenet_n_layers)],\n            bias=False,\n        )\n        self.memory_rnn = nn.LSTMCell(input_size=prenet_dim, hidden_size=memory_rnn_dim)\n        self.output_net = Outputnet(\n            encoder_dim, memory_rnn_dim, frame_channels, outputnet_size, flat_start_params, std_floor\n        )\n        self.register_buffer(\"go_tokens\", torch.zeros(ar_order, 1))\n\n    def forward(self, inputs, inputs_len, mels, mel_lens):\n        r\"\"\"HMM forward algorithm for training uses logarithmic version of Rabiner (1989) forward algorithm.\n\n        Args:\n            inputs (torch.FloatTensor): Encoder outputs\n            
inputs_len (torch.LongTensor): Encoder output lengths\n            mels (torch.FloatTensor): Mel inputs\n            mel_lens (torch.LongTensor): Length of mel inputs\n\n        Shapes:\n            - inputs: (B, T, D_out_enc)\n            - inputs_len: (B)\n            - mels: (B, D_mel, T_mel)\n            - mel_lens: (B)\n\n        Returns:\n            log_prob (torch.FloatTensor): Log probability of the sequence\n        \"\"\"\n        # Get dimensions of inputs\n        batch_size, N, _ = inputs.shape\n        T_max = torch.max(mel_lens)\n        mels = mels.permute(0, 2, 1)\n\n        # Initialize forward algorithm\n        log_state_priors = self._initialize_log_state_priors(inputs)\n        log_c, log_alpha_scaled, transition_matrix, means = self._initialize_forward_algorithm_variables(mels, N)\n\n        # Initialize autoregression elements\n        ar_inputs = self._add_go_token(mels)\n        h_memory, c_memory = self._init_lstm_states(batch_size, self.memory_rnn_dim, mels)\n\n        for t in range(T_max):\n            # Process Autoregression\n            h_memory, c_memory = self._process_ar_timestep(t, ar_inputs, h_memory, c_memory)\n            # Get mean, std and transition vector from decoder for this timestep\n            # Note: Gradient checkpointing currently does not work with multiple GPUs inside a loop\n            if self.use_grad_checkpointing and self.training:\n                mean, std, transition_vector = checkpoint(self.output_net, h_memory, inputs)\n            else:\n                mean, std, transition_vector = self.output_net(h_memory, inputs)\n\n            if t == 0:\n                log_alpha_temp = log_state_priors + self.emission_model(mels[:, 0], mean, std, inputs_len)\n            else:\n                log_alpha_temp = self.emission_model(mels[:, t], mean, std, inputs_len) + self.transition_model(\n                    log_alpha_scaled[:, t - 1, :], transition_vector, inputs_len\n                )\n            log_c[:, 
t] = torch.logsumexp(log_alpha_temp, dim=1)\n            log_alpha_scaled[:, t, :] = log_alpha_temp - log_c[:, t].unsqueeze(1)\n            transition_matrix[:, t] = transition_vector  # needed for absorption state calculation\n\n            # Save for plotting\n            means.append(mean.detach())\n\n        log_c, log_alpha_scaled = self._mask_lengths(mel_lens, log_c, log_alpha_scaled)\n\n        sum_final_log_c = self.get_absorption_state_scaling_factor(\n            mel_lens, log_alpha_scaled, inputs_len, transition_matrix\n        )\n\n        log_probs = torch.sum(log_c, dim=1) + sum_final_log_c\n\n        return log_probs, log_alpha_scaled, transition_matrix, means\n\n    @staticmethod\n    def _mask_lengths(mel_lens, log_c, log_alpha_scaled):\n        \"\"\"\n        Mask the lengths of the forward variables so that the variable lengths\n        do not contribute to the loss calculation\n        Args:\n            mel_lens (torch.IntTensor): (batch)\n            log_c (torch.FloatTensor): (batch, T)\n            log_alpha_scaled (torch.FloatTensor): (batch, T, N)\n        Returns:\n            log_c (torch.FloatTensor) : scaled probabilities (batch, T)\n            log_alpha_scaled (torch.FloatTensor): forward probabilities (batch, T, N)\n        \"\"\"\n        mask_log_c = sequence_mask(mel_lens)\n        log_c = log_c * mask_log_c\n        mask_log_alpha_scaled = mask_log_c.unsqueeze(2)\n        log_alpha_scaled = log_alpha_scaled * mask_log_alpha_scaled\n        return log_c, log_alpha_scaled\n\n    def _process_ar_timestep(\n        self,\n        t,\n        ar_inputs,\n        h_memory,\n        c_memory,\n    ):\n        \"\"\"\n        Process autoregression in timestep\n        1. At a specific t timestep\n        2. Perform data dropout if applied (we did not use it)\n        3. Run the autoregressive frame through the prenet (has dropout)\n        4. 
Run the prenet output through the post prenet rnn\n\n        Args:\n            t (int): mel-spec timestep\n            ar_inputs (torch.FloatTensor): go-token appended mel-spectrograms\n                - shape: (b, D_out, T_out)\n            h_post_prenet (torch.FloatTensor): previous timestep rnn hidden state\n                - shape: (b, memory_rnn_dim)\n            c_post_prenet (torch.FloatTensor): previous timestep rnn cell state\n                - shape: (b, memory_rnn_dim)\n\n        Returns:\n            h_post_prenet (torch.FloatTensor): rnn hidden state of the current timestep\n            c_post_prenet (torch.FloatTensor): rnn cell state of the current timestep\n        \"\"\"\n        prenet_input = ar_inputs[:, t : t + self.ar_order].flatten(1)\n        memory_inputs = self.prenet(prenet_input)\n        h_memory, c_memory = self.memory_rnn(memory_inputs, (h_memory, c_memory))\n        return h_memory, c_memory\n\n    def _add_go_token(self, mel_inputs):\n        \"\"\"Append the go token to create the autoregressive input\n        Args:\n            mel_inputs (torch.FloatTensor): (batch_size, T, n_mel_channel)\n        Returns:\n            ar_inputs (torch.FloatTensor): (batch_size, T, n_mel_channel)\n        \"\"\"\n        batch_size, T, _ = mel_inputs.shape\n        go_tokens = self.go_tokens.unsqueeze(0).expand(batch_size, self.ar_order, self.frame_channels)\n        ar_inputs = torch.cat((go_tokens, mel_inputs), dim=1)[:, :T]\n        return ar_inputs\n\n    @staticmethod\n    def _initialize_forward_algorithm_variables(mel_inputs, N):\n        r\"\"\"Initialize placeholders for forward algorithm variables, to use a stable\n                version we will use log_alpha_scaled and the scaling constant\n\n        Args:\n            mel_inputs (torch.FloatTensor): (b, T_max, frame_channels)\n            N (int): number of states\n        Returns:\n            log_c (torch.FloatTensor): Scaling constant (b, T_max)\n        \"\"\"\n        b, T_max, 
_ = mel_inputs.shape\n        log_alpha_scaled = mel_inputs.new_zeros((b, T_max, N))\n        log_c = mel_inputs.new_zeros(b, T_max)\n        transition_matrix = mel_inputs.new_zeros((b, T_max, N))\n\n        # Saving for plotting later, will not have gradient tapes\n        means = []\n        return log_c, log_alpha_scaled, transition_matrix, means\n\n    @staticmethod\n    def _init_lstm_states(batch_size, hidden_state_dim, device_tensor):\n        r\"\"\"\n        Initialize Hidden and Cell states for LSTM Cell\n\n        Args:\n            batch_size (Int): batch size\n            hidden_state_dim (Int): dimensions of the h and c\n            device_tensor (torch.FloatTensor): useful for the device and type\n\n        Returns:\n            (torch.FloatTensor): shape (batch_size, hidden_state_dim)\n                can be hidden state for LSTM\n            (torch.FloatTensor): shape (batch_size, hidden_state_dim)\n                can be the cell state for LSTM\n        \"\"\"\n        return (\n            device_tensor.new_zeros(batch_size, hidden_state_dim),\n            device_tensor.new_zeros(batch_size, hidden_state_dim),\n        )\n\n    def get_absorption_state_scaling_factor(self, mels_len, log_alpha_scaled, inputs_len, transition_vector):\n        \"\"\"Returns the final scaling factor of absorption state\n\n        Args:\n            mels_len (torch.IntTensor): Input size of mels to\n                    get the last timestep of log_alpha_scaled\n            log_alpha_scaled (torch.FloatTensor): State probabilities\n            text_lengths (torch.IntTensor): length of the states to\n                    mask the values of states lengths\n                (\n                    Useful when the batch has very different lengths,\n                    when the length of an observation is less than\n                    the number of max states, then the log alpha after\n                    the state value is filled with -infs. 
So we mask\n                    those values so that only the states\n                    needed for that length are considered\n                )\n            transition_vector (torch.FloatTensor): transition vector for each state per timestep\n\n        Shapes:\n            - mels_len: (batch_size)\n            - log_alpha_scaled: (batch_size, T, N)\n            - text_lengths: (batch_size)\n            - transition_vector: (batch_size, T, N)\n\n        Returns:\n            sum_final_log_c (torch.FloatTensor): (batch_size)\n\n        \"\"\"\n        N = torch.max(inputs_len)\n        max_inputs_len = log_alpha_scaled.shape[2]\n        state_lengths_mask = sequence_mask(inputs_len, max_len=max_inputs_len)\n\n        last_log_alpha_scaled_index = (\n            (mels_len - 1).unsqueeze(-1).expand(-1, N).unsqueeze(1)\n        )  # Batch X Hidden State Size\n        last_log_alpha_scaled = torch.gather(log_alpha_scaled, 1, last_log_alpha_scaled_index).squeeze(1)\n        last_log_alpha_scaled = last_log_alpha_scaled.masked_fill(~state_lengths_mask, -float(\"inf\"))\n\n        last_transition_vector = torch.gather(transition_vector, 1, last_log_alpha_scaled_index).squeeze(1)\n        last_transition_probability = torch.sigmoid(last_transition_vector)\n        log_probability_of_transitioning = OverflowUtils.log_clamped(last_transition_probability)\n\n        last_transition_probability_index = self.get_mask_for_last_item(inputs_len, inputs_len.device)\n        log_probability_of_transitioning = log_probability_of_transitioning.masked_fill(\n            ~last_transition_probability_index, -float(\"inf\")\n        )\n        final_log_c = last_log_alpha_scaled + log_probability_of_transitioning\n\n        # If the length of the mel is less than the number of states, this selects -inf values, leading to NaN gradients.\n        # Ideally the dataset should be cleaned; the clamp below is a small workaround.\n        final_log_c = 
final_log_c.clamp(min=torch.finfo(final_log_c.dtype).min)\n\n        sum_final_log_c = torch.logsumexp(final_log_c, dim=1)\n        return sum_final_log_c\n\n    @staticmethod\n    def get_mask_for_last_item(lengths, device, out_tensor=None):\n        \"\"\"Returns n-1 mask for the last item in the sequence.\n\n        Args:\n            lengths (torch.IntTensor): lengths in a batch\n            device (str or torch.device): device on which the mask is created.\n            out_tensor (torch.Tensor, optional): uses the memory of a specific tensor.\n                Defaults to None.\n\n        Returns:\n            - Shape: :math:`(b, max_len)`\n        \"\"\"\n        max_len = torch.max(lengths).item()\n        ids = (\n            torch.arange(0, max_len, device=device) if out_tensor is None else torch.arange(0, max_len, out=out_tensor)\n        )\n        mask = ids == lengths.unsqueeze(1) - 1\n        return mask\n\n    @torch.inference_mode()\n    def inference(\n        self,\n        inputs: torch.FloatTensor,\n        input_lens: torch.LongTensor,\n        sampling_temp: float,\n        max_sampling_time: int,\n        duration_threshold: float,\n    ):\n        \"\"\"Inference from autoregressive neural HMM\n\n        Args:\n            inputs (torch.FloatTensor): input states\n                - shape: :math:`(b, T, d)`\n            input_lens (torch.LongTensor): input state lengths\n                - shape: :math:`(b)`\n            sampling_temp (float): sampling temperature\n            max_sampling_time (int): max sampling time\n            duration_threshold (float): duration threshold to switch to next state\n                - Use this to change the speaking rate of the synthesised audio\n        \"\"\"\n\n        b = inputs.shape[0]\n        outputs = {\n            \"hmm_outputs\": [],\n            \"hmm_outputs_len\": [],\n            \"alignments\": [],\n            \"input_parameters\": [],\n            \"output_parameters\": [],\n        }\n        for i in 
range(b):\n            neural_hmm_outputs, states_travelled, input_parameters, output_parameters = self.sample(\n                inputs[i : i + 1], input_lens[i], sampling_temp, max_sampling_time, duration_threshold\n            )\n\n            outputs[\"hmm_outputs\"].append(neural_hmm_outputs)\n            outputs[\"hmm_outputs_len\"].append(neural_hmm_outputs.shape[0])\n            outputs[\"alignments\"].append(states_travelled)\n            outputs[\"input_parameters\"].append(input_parameters)\n            outputs[\"output_parameters\"].append(output_parameters)\n\n        outputs[\"hmm_outputs\"] = nn.utils.rnn.pad_sequence(outputs[\"hmm_outputs\"], batch_first=True)\n        outputs[\"hmm_outputs_len\"] = torch.tensor(\n            outputs[\"hmm_outputs_len\"], dtype=input_lens.dtype, device=input_lens.device\n        )\n        return outputs\n\n    @torch.inference_mode()\n    def sample(self, inputs, input_lens, sampling_temp, max_sampling_time, duration_threshold):\n        \"\"\"Samples an output from the parameter models\n\n        Args:\n            inputs (torch.FloatTensor): input states\n                - shape: :math:`(1, T, d)`\n            input_lens (torch.LongTensor): input state lengths\n                - shape: :math:`(1)`\n            sampling_temp (float): sampling temperature\n            max_sampling_time (int): max sampling time\n            duration_threshold (float): duration threshold to switch to next state\n\n        Returns:\n            outputs (torch.FloatTensor): Output Observations\n                - Shape: :math:`(T, output_dim)`\n            states_travelled (list[int]): Hidden states travelled\n                - Shape: :math:`(T)`\n            input_parameters (list[torch.FloatTensor]): Input parameters\n            output_parameters (list[torch.FloatTensor]): Output parameters\n        \"\"\"\n        states_travelled, outputs, t = [], [], 0\n\n        # Sample initial state\n        current_state = 0\n        
states_travelled.append(current_state)\n\n        # Prepare autoregression\n        prenet_input = self.go_tokens.unsqueeze(0).expand(1, self.ar_order, self.frame_channels)\n        h_memory, c_memory = self._init_lstm_states(1, self.memory_rnn_dim, prenet_input)\n\n        input_parameter_values = []\n        output_parameter_values = []\n        quantile = 1\n        while True:\n            memory_input = self.prenet(prenet_input.flatten(1).unsqueeze(0))\n            # will be 1 while sampling\n            h_memory, c_memory = self.memory_rnn(memory_input.squeeze(0), (h_memory, c_memory))\n\n            z_t = inputs[:, current_state].unsqueeze(0)  # Add fake time dimension\n            mean, std, transition_vector = self.output_net(h_memory, z_t)\n\n            transition_probability = torch.sigmoid(transition_vector.flatten())\n            staying_probability = torch.sigmoid(-transition_vector.flatten())\n\n            # Save for plotting\n            input_parameter_values.append([prenet_input, current_state])\n            output_parameter_values.append([mean, std, transition_probability])\n\n            x_t = self.emission_model.sample(mean, std, sampling_temp=sampling_temp)\n\n            # Prepare autoregressive input for next iteration\n            prenet_input = torch.cat((prenet_input, x_t), dim=1)[:, 1:]\n\n            outputs.append(x_t.flatten())\n\n            transition_matrix = torch.cat((staying_probability, transition_probability))\n            quantile *= staying_probability\n            if not self.deterministic_transition:\n                switch = transition_matrix.multinomial(1)[0].item()\n            else:\n                switch = quantile < duration_threshold\n\n            if switch:\n                current_state += 1\n                quantile = 1\n\n            states_travelled.append(current_state)\n\n            if (current_state == input_lens) or (max_sampling_time and t == max_sampling_time - 1):\n                break\n\n          
  t += 1\n\n        return (\n            torch.stack(outputs, dim=0),\n            F.one_hot(input_lens.new_tensor(states_travelled)),\n            input_parameter_values,\n            output_parameter_values,\n        )\n\n    @staticmethod\n    def _initialize_log_state_priors(text_embeddings):\n        \"\"\"Creates the log pi in forward algorithm.\n\n        Args:\n            text_embeddings (torch.FloatTensor): used to create the log pi\n                    on current device\n\n        Shapes:\n            - text_embeddings: (B, T, D_out_enc)\n        \"\"\"\n        N = text_embeddings.shape[1]\n        log_state_priors = text_embeddings.new_full([N], -float(\"inf\"))\n        log_state_priors[0] = 0.0\n        return log_state_priors\n\n\nclass TransitionModel(nn.Module):\n    \"\"\"Transition Model of the HMM. It represents the probability of transitioning\n    from the current state to all other states\"\"\"\n\n    def forward(self, log_alpha_scaled, transition_vector, inputs_len):  # pylint: disable=no-self-use\n        r\"\"\"\n        product of the past state with transitional probabilities in log space\n\n        Args:\n            log_alpha_scaled (torch.Tensor): Multiply previous timestep's alphas by\n                        transition matrix (in log domain)\n                - shape: (batch size, N)\n            transition_vector (torch.tensor): transition vector for each state\n                - shape: (N)\n            inputs_len (int tensor): Lengths of states in a batch\n                - shape: (batch)\n\n        Returns:\n            out (torch.FloatTensor): log probability of transitioning to each state\n        \"\"\"\n        transition_p = torch.sigmoid(transition_vector)\n        staying_p = torch.sigmoid(-transition_vector)\n\n        log_staying_probability = OverflowUtils.log_clamped(staying_p)\n        log_transition_probability = OverflowUtils.log_clamped(transition_p)\n\n        staying = log_alpha_scaled + log_staying_probability\n    
    leaving = log_alpha_scaled + log_transition_probability\n        leaving = leaving.roll(1, dims=1)\n        leaving[:, 0] = -float(\"inf\")\n        inputs_len_mask = sequence_mask(inputs_len)\n        out = OverflowUtils.logsumexp(torch.stack((staying, leaving), dim=2), dim=2)\n        out = out.masked_fill(~inputs_len_mask, -float(\"inf\"))  # There are no states to contribute to the loss\n        return out\n\n\nclass EmissionModel(nn.Module):\n    \"\"\"Emission Model of the HMM. It represents the probability of\n    emitting an observation based on the current state\"\"\"\n\n    def __init__(self) -> None:\n        super().__init__()\n        self.distribution_function: tdist.Distribution = tdist.normal.Normal\n\n    def sample(self, means, stds, sampling_temp):\n        return self.distribution_function(means, stds * sampling_temp).sample() if sampling_temp > 0 else means\n\n    def forward(self, x_t, means, stds, state_lengths):\n        r\"\"\"Calculates the log probability of the given data (x_t)\n            being observed from states with given means and stds\n        Args:\n            x_t (float tensor) : observation at current time step\n                - shape: (batch, feature_dim)\n            means (float tensor): means of the distributions of hidden states\n                - shape: (batch, hidden_state, feature_dim)\n            stds (float tensor): standard deviations of the distributions of the hidden states\n                - shape: (batch, hidden_state, feature_dim)\n            state_lengths (int tensor): Lengths of states in a batch\n                - shape: (batch)\n\n        Returns:\n            out (float tensor): observation log likelihoods,\n                                    expressing the probability of an observation\n                being generated from a state i\n                shape: (batch, hidden_state)\n        \"\"\"\n        emission_dists = self.distribution_function(means, stds)\n        out = 
emission_dists.log_prob(x_t.unsqueeze(1))\n        state_lengths_mask = sequence_mask(state_lengths).unsqueeze(2)\n        out = torch.sum(out * state_lengths_mask, dim=2)\n        return out\n"
  },
  {
    "path": "TTS/tts/layers/overflow/plotting_utils.py",
    "content": "from typing import Any\n\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport torch\n\n\ndef validate_numpy_array(value: Any):\n    r\"\"\"\n    Validates the input and makes sure it returns a numpy array (i.e on CPU)\n\n    Args:\n        value (Any): the input value\n\n    Raises:\n        TypeError: if the value is not a numpy array or torch tensor\n\n    Returns:\n        np.ndarray: numpy array of the value\n    \"\"\"\n    if isinstance(value, np.ndarray):\n        pass\n    elif isinstance(value, list):\n        value = np.array(value)\n    elif torch.is_tensor(value):\n        value = value.cpu().numpy()\n    else:\n        raise TypeError(\"Value must be a numpy array, a torch tensor or a list\")\n\n    return value\n\n\ndef get_spec_from_most_probable_state(log_alpha_scaled, means, decoder=None):\n    \"\"\"Get the most probable state means from the log_alpha_scaled.\n\n    Args:\n        log_alpha_scaled (torch.Tensor): Log alpha scaled values.\n            - Shape: :math:`(T, N)`\n        means (torch.Tensor): Means of the states.\n            - Shape: :math:`(N, T, D_out)`\n        decoder (torch.nn.Module): Decoder module to decode the latent to melspectrogram. 
Defaults to None.\n    \"\"\"\n    max_state_numbers = torch.max(log_alpha_scaled, dim=1)[1]\n    max_len = means.shape[0]\n    n_mel_channels = means.shape[2]\n    max_state_numbers = max_state_numbers.unsqueeze(1).unsqueeze(1).expand(max_len, 1, n_mel_channels)\n    means = torch.gather(means, 1, max_state_numbers).squeeze(1).to(log_alpha_scaled.dtype)\n    if decoder is not None:\n        mel = (\n            decoder(means.T.unsqueeze(0), torch.tensor([means.shape[0]], device=means.device), reverse=True)[0]\n            .squeeze(0)\n            .T\n        )\n    else:\n        mel = means\n    return mel\n\n\ndef plot_transition_probabilities_to_numpy(states, transition_probabilities, output_fig=False):\n    \"\"\"Generates transition probabilities plot for the states and the probability of transition.\n\n    Args:\n        states (torch.IntTensor): the states\n        transition_probabilities (torch.FloatTensor): the transition probabilities\n        output_fig (bool, optional): if False, the figure is closed before it is returned. Defaults to False.\n    \"\"\"\n    states = validate_numpy_array(states)\n    transition_probabilities = validate_numpy_array(transition_probabilities)\n\n    fig, ax = plt.subplots(figsize=(30, 3))\n    ax.plot(transition_probabilities, \"o\")\n    ax.set_title(\"Transition probability of state\")\n    ax.set_xlabel(\"hidden state\")\n    ax.set_ylabel(\"probability\")\n    ax.set_xticks([i for i in range(len(transition_probabilities))])  # pylint: disable=unnecessary-comprehension\n    ax.set_xticklabels([int(x) for x in states], rotation=90)\n    plt.tight_layout()\n    if not output_fig:\n        plt.close()\n    return fig\n"
  },
  {
    "path": "TTS/tts/layers/tacotron/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/layers/tacotron/attentions.py",
    "content": "import torch\nfrom scipy.stats import betabinom\nfrom torch import nn\nfrom torch.nn import functional as F\n\nfrom TTS.tts.layers.tacotron.common_layers import Linear\n\n\nclass LocationLayer(nn.Module):\n    \"\"\"Layers for Location Sensitive Attention\n\n    Args:\n        attention_dim (int): number of channels in the input tensor.\n        attention_n_filters (int, optional): number of filters in convolution. Defaults to 32.\n        attention_kernel_size (int, optional): kernel size of convolution filter. Defaults to 31.\n    \"\"\"\n\n    def __init__(self, attention_dim, attention_n_filters=32, attention_kernel_size=31):\n        super().__init__()\n        self.location_conv1d = nn.Conv1d(\n            in_channels=2,\n            out_channels=attention_n_filters,\n            kernel_size=attention_kernel_size,\n            stride=1,\n            padding=(attention_kernel_size - 1) // 2,\n            bias=False,\n        )\n        self.location_dense = Linear(attention_n_filters, attention_dim, bias=False, init_gain=\"tanh\")\n\n    def forward(self, attention_cat):\n        \"\"\"\n        Shapes:\n            attention_cat: [B, 2, C]\n        \"\"\"\n        processed_attention = self.location_conv1d(attention_cat)\n        processed_attention = self.location_dense(processed_attention.transpose(1, 2))\n        return processed_attention\n\n\nclass GravesAttention(nn.Module):\n    \"\"\"Graves Attention as is ref1 with updates from ref2.\n    ref1: https://arxiv.org/abs/1910.10288\n    ref2: https://arxiv.org/pdf/1906.01083.pdf\n\n    Args:\n        query_dim (int): number of channels in query tensor.\n        K (int): number of Gaussian heads to be used for computing attention.\n    \"\"\"\n\n    COEF = 0.3989422917366028  # numpy.sqrt(1/(2*numpy.pi))\n\n    def __init__(self, query_dim, K):\n        super().__init__()\n        self._mask_value = 1e-8\n        self.K = K\n        # self.attention_alignment = 0.05\n        self.eps = 
1e-5\n        self.J = None\n        self.N_a = nn.Sequential(\n            nn.Linear(query_dim, query_dim, bias=True), nn.ReLU(), nn.Linear(query_dim, 3 * K, bias=True)\n        )\n        self.attention_weights = None\n        self.mu_prev = None\n        self.init_layers()\n\n    def init_layers(self):\n        torch.nn.init.constant_(self.N_a[2].bias[(2 * self.K) : (3 * self.K)], 1.0)  # bias mean\n        torch.nn.init.constant_(self.N_a[2].bias[self.K : (2 * self.K)], 10)  # bias std\n\n    def init_states(self, inputs):\n        if self.J is None or inputs.shape[1] + 1 > self.J.shape[-1]:\n            self.J = torch.arange(0, inputs.shape[1] + 2.0).to(inputs.device) + 0.5\n        self.attention_weights = torch.zeros(inputs.shape[0], inputs.shape[1]).to(inputs.device)\n        self.mu_prev = torch.zeros(inputs.shape[0], self.K).to(inputs.device)\n\n    # pylint: disable=R0201\n    # pylint: disable=unused-argument\n    def preprocess_inputs(self, inputs):\n        return None\n\n    def forward(self, query, inputs, processed_inputs, mask):\n        \"\"\"\n        Shapes:\n            query: [B, C_attention_rnn]\n            inputs: [B, T_in, C_encoder]\n            processed_inputs: place_holder\n            mask: [B, T_in]\n        \"\"\"\n        gbk_t = self.N_a(query)\n        gbk_t = gbk_t.view(gbk_t.size(0), -1, self.K)\n\n        # attention model parameters\n        # each B x K\n        g_t = gbk_t[:, 0, :]\n        b_t = gbk_t[:, 1, :]\n        k_t = gbk_t[:, 2, :]\n\n        # dropout to decorrelate attention heads\n        g_t = torch.nn.functional.dropout(g_t, p=0.5, training=self.training)\n\n        # attention GMM parameters\n        sig_t = torch.nn.functional.softplus(b_t) + self.eps\n\n        mu_t = self.mu_prev + torch.nn.functional.softplus(k_t)\n        g_t = torch.softmax(g_t, dim=-1) + self.eps\n\n        j = self.J[: inputs.size(1) + 1]\n\n        # attention weights\n        phi_t = g_t.unsqueeze(-1) * (1 / (1 + 
torch.sigmoid((mu_t.unsqueeze(-1) - j) / sig_t.unsqueeze(-1))))\n\n        # discretize attention weights\n        alpha_t = torch.sum(phi_t, 1)\n        alpha_t = alpha_t[:, 1:] - alpha_t[:, :-1]\n        alpha_t[alpha_t == 0] = 1e-8\n\n        # apply masking\n        if mask is not None:\n            alpha_t.data.masked_fill_(~mask, self._mask_value)\n\n        context = torch.bmm(alpha_t.unsqueeze(1), inputs).squeeze(1)\n        self.attention_weights = alpha_t\n        self.mu_prev = mu_t\n        return context\n\n\nclass OriginalAttention(nn.Module):\n    \"\"\"Bahdanau Attention with various optional modifications.\n    - Location sensitive attention: https://arxiv.org/abs/1712.05884\n    - Forward Attention: https://arxiv.org/abs/1807.06736 + state masking at inference\n    - Using sigmoid instead of softmax normalization\n    - Attention windowing at inference time\n\n    Note:\n        Location Sensitive Attention extends the additive attention mechanism\n    to use cumulative attention weights from previous decoder time steps with the current time step features.\n\n        Forward attention computes the most probable monotonic alignment. The modified attention probabilities at each\n    timestep are computed recursively by the forward algorithm.\n\n        Transition agent in the forward attention explicitly gates the attention mechanism whether to move forward or\n    stay at each decoder timestep.\n\n        Attention windowing is an inductive prior that prevents the model from attending to previous and future\n    timesteps beyond a certain window.\n\n    Args:\n        query_dim (int): number of channels in the query tensor.\n        embedding_dim (int): number of channels in the value tensor. 
In general, the value tensor is the output of the encoder layer.\n        attention_dim (int): number of channels of the inner attention layers.\n        location_attention (bool): enable/disable location sensitive attention.\n        attention_location_n_filters (int): number of location attention filters.\n        attention_location_kernel_size (int): filter size of location attention convolution layer.\n        windowing (int): window size for attention windowing. if it is 5, for computing the attention, it only considers the time steps [(t-5), ..., (t+5)] of the input.\n        norm (str): normalization method applied to the attention weights. 'softmax' or 'sigmoid'\n        forward_attn (bool): enable/disable forward attention.\n        trans_agent (bool): enable/disable transition agent in the forward attention.\n        forward_attn_mask (int): enable/disable an explicit masking in forward attention. It is useful to set at especially inference time.\n    \"\"\"\n\n    # Pylint gets confused by PyTorch conventions here\n    # pylint: disable=attribute-defined-outside-init\n    def __init__(\n        self,\n        query_dim,\n        embedding_dim,\n        attention_dim,\n        location_attention,\n        attention_location_n_filters,\n        attention_location_kernel_size,\n        windowing,\n        norm,\n        forward_attn,\n        trans_agent,\n        forward_attn_mask,\n    ):\n        super().__init__()\n        self.query_layer = Linear(query_dim, attention_dim, bias=False, init_gain=\"tanh\")\n        self.inputs_layer = Linear(embedding_dim, attention_dim, bias=False, init_gain=\"tanh\")\n        self.v = Linear(attention_dim, 1, bias=True)\n        if trans_agent:\n            self.ta = nn.Linear(query_dim + embedding_dim, 1, bias=True)\n        if location_attention:\n            self.location_layer = LocationLayer(\n                attention_dim,\n                attention_location_n_filters,\n                
attention_location_kernel_size,\n            )\n        self._mask_value = -float(\"inf\")\n        self.windowing = windowing\n        self.win_idx = None\n        self.norm = norm\n        self.forward_attn = forward_attn\n        self.trans_agent = trans_agent\n        self.forward_attn_mask = forward_attn_mask\n        self.location_attention = location_attention\n\n    def init_win_idx(self):\n        self.win_idx = -1\n        self.win_back = 2\n        self.win_front = 6\n\n    def init_forward_attn(self, inputs):\n        B = inputs.shape[0]\n        T = inputs.shape[1]\n        self.alpha = torch.cat([torch.ones([B, 1]), torch.zeros([B, T])[:, :-1] + 1e-7], dim=1).to(inputs.device)\n        self.u = (0.5 * torch.ones([B, 1])).to(inputs.device)\n\n    def init_location_attention(self, inputs):\n        B = inputs.size(0)\n        T = inputs.size(1)\n        self.attention_weights_cum = torch.zeros([B, T], device=inputs.device)\n\n    def init_states(self, inputs):\n        B = inputs.size(0)\n        T = inputs.size(1)\n        self.attention_weights = torch.zeros([B, T], device=inputs.device)\n        if self.location_attention:\n            self.init_location_attention(inputs)\n        if self.forward_attn:\n            self.init_forward_attn(inputs)\n        if self.windowing:\n            self.init_win_idx()\n\n    def preprocess_inputs(self, inputs):\n        return self.inputs_layer(inputs)\n\n    def update_location_attention(self, alignments):\n        self.attention_weights_cum += alignments\n\n    def get_location_attention(self, query, processed_inputs):\n        attention_cat = torch.cat((self.attention_weights.unsqueeze(1), self.attention_weights_cum.unsqueeze(1)), dim=1)\n        processed_query = self.query_layer(query.unsqueeze(1))\n        processed_attention_weights = self.location_layer(attention_cat)\n        energies = self.v(torch.tanh(processed_query + processed_attention_weights + processed_inputs))\n        energies = 
energies.squeeze(-1)\n        return energies, processed_query\n\n    def get_attention(self, query, processed_inputs):\n        processed_query = self.query_layer(query.unsqueeze(1))\n        energies = self.v(torch.tanh(processed_query + processed_inputs))\n        energies = energies.squeeze(-1)\n        return energies, processed_query\n\n    def apply_windowing(self, attention, inputs):\n        back_win = self.win_idx - self.win_back\n        front_win = self.win_idx + self.win_front\n        if back_win > 0:\n            attention[:, :back_win] = -float(\"inf\")\n        if front_win < inputs.shape[1]:\n            attention[:, front_win:] = -float(\"inf\")\n        # this is a trick to solve a special problem.\n        # but it does not hurt.\n        if self.win_idx == -1:\n            attention[:, 0] = attention.max()\n        # Update the window\n        self.win_idx = torch.argmax(attention, 1).long()[0].item()\n        return attention\n\n    def apply_forward_attention(self, alignment):\n        # forward attention\n        fwd_shifted_alpha = F.pad(self.alpha[:, :-1].clone().to(alignment.device), (1, 0, 0, 0))\n        # compute transition potentials\n        alpha = ((1 - self.u) * self.alpha + self.u * fwd_shifted_alpha + 1e-8) * alignment\n        # force incremental alignment\n        if not self.training and self.forward_attn_mask:\n            _, n = fwd_shifted_alpha.max(1)\n            val, _ = alpha.max(1)\n            for b in range(alignment.shape[0]):\n                alpha[b, n[b] + 3 :] = 0\n                alpha[b, : (n[b] - 1)] = 0  # ignore all previous states to prevent repetition.\n                alpha[b, (n[b] - 2)] = 0.01 * val[b]  # smoothing factor for the prev step\n        # renormalize attention weights\n        alpha = alpha / alpha.sum(dim=1, keepdim=True)\n        return alpha\n\n    def forward(self, query, inputs, processed_inputs, mask):\n        \"\"\"\n        shapes:\n            query: [B, C_attn_rnn]\n            
inputs: [B, T_en, D_en]\n            processed_inputs: [B, T_en, D_attn]\n            mask: [B, T_en]\n        \"\"\"\n        if self.location_attention:\n            attention, _ = self.get_location_attention(query, processed_inputs)\n        else:\n            attention, _ = self.get_attention(query, processed_inputs)\n        # apply masking\n        if mask is not None:\n            attention.data.masked_fill_(~mask, self._mask_value)\n        # apply windowing - only in eval mode\n        if not self.training and self.windowing:\n            attention = self.apply_windowing(attention, inputs)\n\n        # normalize attention values\n        if self.norm == \"softmax\":\n            alignment = torch.softmax(attention, dim=-1)\n        elif self.norm == \"sigmoid\":\n            alignment = torch.sigmoid(attention) / torch.sigmoid(attention).sum(dim=1, keepdim=True)\n        else:\n            raise ValueError(\"Unknown value for attention norm type\")\n\n        if self.location_attention:\n            self.update_location_attention(alignment)\n\n        # apply forward attention if enabled\n        if self.forward_attn:\n            alignment = self.apply_forward_attention(alignment)\n            self.alpha = alignment\n\n        context = torch.bmm(alignment.unsqueeze(1), inputs)\n        context = context.squeeze(1)\n        self.attention_weights = alignment\n\n        # compute transition agent\n        if self.forward_attn and self.trans_agent:\n            ta_input = torch.cat([context, query.squeeze(1)], dim=-1)\n            self.u = torch.sigmoid(self.ta(ta_input))\n        return context\n\n\nclass MonotonicDynamicConvolutionAttention(nn.Module):\n    \"\"\"Dynamic convolution attention from\n    https://arxiv.org/pdf/1910.10288.pdf\n\n\n    query -> linear -> tanh -> linear ->|\n                                        |                                            mask values\n                                        v                                  
            |    |\n               atten_w(t-1) -|-> conv1d_dynamic -> linear -|-> tanh -> + -> softmax -> * -> * -> context\n                             |-> conv1d_static  -> linear -|           |\n                             |-> conv1d_prior   -> log ----------------|\n\n    query: attention rnn output.\n\n    Note:\n        Dynamic convolution attention is an alteration of the location sensitive attention with\n    dynamically computed convolution filters from the previous attention scores and a set of\n    constraints to keep the attention alignment diagonal.\n        DCA is sensitive to mixed precision training and might cause unstable training.\n\n    Args:\n        query_dim (int): number of channels in the query tensor.\n        embedding_dim (int): number of channels in the value tensor.\n        static_filter_dim (int): number of channels in the convolution layer computing the static filters.\n        static_kernel_size (int): kernel size for the convolution layer computing the static filters.\n        dynamic_filter_dim (int): number of channels in the convolution layer computing the dynamic filters.\n        dynamic_kernel_size (int): kernel size for the convolution layer computing the dynamic filters.\n        prior_filter_len (int, optional): length of the beta-binomial prior filter. Defaults to 11 from the paper.\n        alpha (float, optional): alpha parameter of the beta-binomial prior. Defaults to 0.1 from the paper.\n        beta (float, optional): beta parameter of the beta-binomial prior. 
Defaults to 0.9 from the paper.\n    \"\"\"\n\n    def __init__(\n        self,\n        query_dim,\n        embedding_dim,  # pylint: disable=unused-argument\n        attention_dim,\n        static_filter_dim,\n        static_kernel_size,\n        dynamic_filter_dim,\n        dynamic_kernel_size,\n        prior_filter_len=11,\n        alpha=0.1,\n        beta=0.9,\n    ):\n        super().__init__()\n        self._mask_value = 1e-8\n        self.dynamic_filter_dim = dynamic_filter_dim\n        self.dynamic_kernel_size = dynamic_kernel_size\n        self.prior_filter_len = prior_filter_len\n        self.attention_weights = None\n        # setup key and query layers\n        self.query_layer = nn.Linear(query_dim, attention_dim)\n        self.key_layer = nn.Linear(attention_dim, dynamic_filter_dim * dynamic_kernel_size, bias=False)\n        self.static_filter_conv = nn.Conv1d(\n            1,\n            static_filter_dim,\n            static_kernel_size,\n            padding=(static_kernel_size - 1) // 2,\n            bias=False,\n        )\n        self.static_filter_layer = nn.Linear(static_filter_dim, attention_dim, bias=False)\n        self.dynamic_filter_layer = nn.Linear(dynamic_filter_dim, attention_dim)\n        self.v = nn.Linear(attention_dim, 1, bias=False)\n\n        prior = betabinom.pmf(range(prior_filter_len), prior_filter_len - 1, alpha, beta)\n        self.register_buffer(\"prior\", torch.FloatTensor(prior).flip(0))\n\n    # pylint: disable=unused-argument\n    def forward(self, query, inputs, processed_inputs, mask):\n        \"\"\"\n        query: [B, C_attn_rnn]\n        inputs: [B, T_en, D_en]\n        processed_inputs: place holder.\n        mask: [B, T_en]\n        \"\"\"\n        # compute prior filters\n        prior_filter = F.conv1d(\n            F.pad(self.attention_weights.unsqueeze(1), (self.prior_filter_len - 1, 0)), self.prior.view(1, 1, -1)\n        )\n        prior_filter = torch.log(prior_filter.clamp_min_(1e-6)).squeeze(1)\n     
   G = self.key_layer(torch.tanh(self.query_layer(query)))\n        # compute dynamic filters\n        dynamic_filter = F.conv1d(\n            self.attention_weights.unsqueeze(0),\n            G.view(-1, 1, self.dynamic_kernel_size),\n            padding=(self.dynamic_kernel_size - 1) // 2,\n            groups=query.size(0),\n        )\n        dynamic_filter = dynamic_filter.view(query.size(0), self.dynamic_filter_dim, -1).transpose(1, 2)\n        # compute static filters\n        static_filter = self.static_filter_conv(self.attention_weights.unsqueeze(1)).transpose(1, 2)\n        alignment = (\n            self.v(\n                torch.tanh(self.static_filter_layer(static_filter) + self.dynamic_filter_layer(dynamic_filter))\n            ).squeeze(-1)\n            + prior_filter\n        )\n        # compute attention weights\n        attention_weights = F.softmax(alignment, dim=-1)\n        # apply masking\n        if mask is not None:\n            attention_weights.data.masked_fill_(~mask, self._mask_value)\n        self.attention_weights = attention_weights\n        # compute context\n        context = torch.bmm(attention_weights.unsqueeze(1), inputs).squeeze(1)\n        return context\n\n    def preprocess_inputs(self, inputs):  # pylint: disable=no-self-use\n        return None\n\n    def init_states(self, inputs):\n        B = inputs.size(0)\n        T = inputs.size(1)\n        self.attention_weights = torch.zeros([B, T], device=inputs.device)\n        self.attention_weights[:, 0] = 1.0\n\n\ndef init_attn(\n    attn_type,\n    query_dim,\n    embedding_dim,\n    attention_dim,\n    location_attention,\n    attention_location_n_filters,\n    attention_location_kernel_size,\n    windowing,\n    norm,\n    forward_attn,\n    trans_agent,\n    forward_attn_mask,\n    attn_K,\n):\n    if attn_type == \"original\":\n        return OriginalAttention(\n            query_dim,\n            embedding_dim,\n            attention_dim,\n            location_attention,\n  
          attention_location_n_filters,\n            attention_location_kernel_size,\n            windowing,\n            norm,\n            forward_attn,\n            trans_agent,\n            forward_attn_mask,\n        )\n    if attn_type == \"graves\":\n        return GravesAttention(query_dim, attn_K)\n    if attn_type == \"dynamic_convolution\":\n        return MonotonicDynamicConvolutionAttention(\n            query_dim,\n            embedding_dim,\n            attention_dim,\n            static_filter_dim=8,\n            static_kernel_size=21,\n            dynamic_filter_dim=8,\n            dynamic_kernel_size=21,\n            prior_filter_len=11,\n            alpha=0.1,\n            beta=0.9,\n        )\n\n    raise RuntimeError(f\" [!] Given Attention Type '{attn_type}' does not exist.\")\n"
  },
  {
    "path": "TTS/tts/layers/tacotron/capacitron_layers.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.distributions.multivariate_normal import MultivariateNormal as MVN\nfrom torch.nn import functional as F\n\n\nclass CapacitronVAE(nn.Module):\n    \"\"\"Effective Use of Variational Embedding Capacity for prosody transfer.\n\n    See https://arxiv.org/abs/1906.03402\"\"\"\n\n    def __init__(\n        self,\n        num_mel,\n        capacitron_VAE_embedding_dim,\n        encoder_output_dim=256,\n        reference_encoder_out_dim=128,\n        speaker_embedding_dim=None,\n        text_summary_embedding_dim=None,\n    ):\n        super().__init__()\n        # Init distributions\n        self.prior_distribution = MVN(\n            torch.zeros(capacitron_VAE_embedding_dim), torch.eye(capacitron_VAE_embedding_dim)\n        )\n        self.approximate_posterior_distribution = None\n        # define output ReferenceEncoder dim to the capacitron_VAE_embedding_dim\n        self.encoder = ReferenceEncoder(num_mel, out_dim=reference_encoder_out_dim)\n\n        # Init beta, the lagrange-like term for the KL distribution\n        self.beta = torch.nn.Parameter(torch.log(torch.exp(torch.Tensor([1.0])) - 1), requires_grad=True)\n        mlp_input_dimension = reference_encoder_out_dim\n\n        if text_summary_embedding_dim is not None:\n            self.text_summary_net = TextSummary(text_summary_embedding_dim, encoder_output_dim=encoder_output_dim)\n            mlp_input_dimension += text_summary_embedding_dim\n        if speaker_embedding_dim is not None:\n            # TODO: Test a multispeaker model!\n            mlp_input_dimension += speaker_embedding_dim\n        self.post_encoder_mlp = PostEncoderMLP(mlp_input_dimension, capacitron_VAE_embedding_dim)\n\n    def forward(self, reference_mel_info=None, text_info=None, speaker_embedding=None):\n        # Use reference\n        if reference_mel_info is not None:\n            reference_mels = reference_mel_info[0]  # [batch_size, num_frames, num_mels]\n            
mel_lengths = reference_mel_info[1]  # [batch_size]\n            enc_out = self.encoder(reference_mels, mel_lengths)\n\n            # concat speaker_embedding and/or text summary embedding\n            if text_info is not None:\n                text_inputs = text_info[0]  # [batch_size, num_characters, num_embedding]\n                input_lengths = text_info[1]\n                text_summary_out = self.text_summary_net(text_inputs, input_lengths).to(reference_mels.device)\n                enc_out = torch.cat([enc_out, text_summary_out], dim=-1)\n            if speaker_embedding is not None:\n                speaker_embedding = torch.squeeze(speaker_embedding)\n                enc_out = torch.cat([enc_out, speaker_embedding], dim=-1)\n\n            # Feed the output of the ref encoder and information about text/speaker into\n            # an MLP to produce the parameters for the approximate posterior distribution\n            mu, sigma = self.post_encoder_mlp(enc_out)\n            # convert to cpu because prior_distribution was created on cpu\n            mu = mu.cpu()\n            sigma = sigma.cpu()\n\n            # Sample from the posterior: z ~ q(z|x)\n            self.approximate_posterior_distribution = MVN(mu, torch.diag_embed(sigma))\n            VAE_embedding = self.approximate_posterior_distribution.rsample()\n        # Infer from the model, bypasses encoding\n        else:\n            # Sample from the prior: z ~ p(z)\n            VAE_embedding = self.prior_distribution.sample().unsqueeze(0)\n\n        # reshape to [batch_size, 1, capacitron_VAE_embedding_dim]\n        return VAE_embedding.unsqueeze(1), self.approximate_posterior_distribution, self.prior_distribution, self.beta\n\n\nclass ReferenceEncoder(nn.Module):\n    \"\"\"NN module creating a fixed size prosody embedding from a spectrogram.\n\n    inputs: mel spectrograms [batch_size, num_spec_frames, num_mel]\n    outputs: [batch_size, embedding_dim]\n    \"\"\"\n\n    def __init__(self, num_mel, 
out_dim):\n        super().__init__()\n        self.num_mel = num_mel\n        filters = [1] + [32, 32, 64, 64, 128, 128]\n        num_layers = len(filters) - 1\n        convs = [\n            nn.Conv2d(\n                in_channels=filters[i], out_channels=filters[i + 1], kernel_size=(3, 3), stride=(2, 2), padding=(2, 2)\n            )\n            for i in range(num_layers)\n        ]\n        self.convs = nn.ModuleList(convs)\n        self.training = False\n        self.bns = nn.ModuleList([nn.BatchNorm2d(num_features=filter_size) for filter_size in filters[1:]])\n\n        post_conv_height = self.calculate_post_conv_height(num_mel, 3, 2, 2, num_layers)\n        self.recurrence = nn.LSTM(\n            input_size=filters[-1] * post_conv_height, hidden_size=out_dim, batch_first=True, bidirectional=False\n        )\n\n    def forward(self, inputs, input_lengths):\n        batch_size = inputs.size(0)\n        x = inputs.view(batch_size, 1, -1, self.num_mel)  # [batch_size, num_channels==1, num_frames, num_mel]\n        valid_lengths = input_lengths.float()  # [batch_size]\n        for conv, bn in zip(self.convs, self.bns):\n            x = conv(x)\n            x = bn(x)\n            x = F.relu(x)\n\n            # Create the post conv width mask based on the valid lengths of the output of the convolution.\n            # The valid length of the output of a convolution on a varying-length input is\n            # ceil(input_length/stride) + 1 for kernel_size=3, stride=2 and padding=2\n            # For example (kernel_size=3, stride=2, padding=2):\n            # 0 0 x x x x x 0 0 -> Input = 5, 0 is zero padding, x is valid values coming from padding=2 in conv2d\n            # _____\n            #   x _____\n            #       x _____\n            #           x  ____\n            #               x\n            # x x x x -> Output valid length = 4\n            # Since every example in the batch is zero padded and therefore has its own valid length,\n            # we need to mask off 
all the values AFTER the valid length for each example in the batch.\n            # Otherwise, the convolutions create noise and a lot of not real information\n            valid_lengths = (valid_lengths / 2).float()\n            valid_lengths = torch.ceil(valid_lengths).to(dtype=torch.int64) + 1  # 2 is stride -- size: [batch_size]\n            post_conv_max_width = x.size(2)\n\n            mask = torch.arange(post_conv_max_width).to(inputs.device).expand(\n                len(valid_lengths), post_conv_max_width\n            ) < valid_lengths.unsqueeze(1)\n            mask = mask.expand(1, 1, -1, -1).transpose(2, 0).transpose(-1, 2)  # [batch_size, 1, post_conv_max_width, 1]\n            x = x * mask\n\n        x = x.transpose(1, 2)\n        # x: 4D tensor [batch_size, post_conv_width,\n        #               num_channels==128, post_conv_height]\n\n        post_conv_width = x.size(1)\n        x = x.contiguous().view(batch_size, post_conv_width, -1)\n        # x: 3D tensor [batch_size, post_conv_width,\n        #               num_channels*post_conv_height]\n\n        # Routine for fetching the last valid output of a dynamic LSTM with varying input lengths and padding\n        post_conv_input_lengths = valid_lengths\n        packed_seqs = nn.utils.rnn.pack_padded_sequence(\n            x, post_conv_input_lengths.tolist(), batch_first=True, enforce_sorted=False\n        )  # dynamic rnn sequence padding\n        self.recurrence.flatten_parameters()\n        _, (ht, _) = self.recurrence(packed_seqs)\n        last_output = ht[-1]\n\n        return last_output.to(inputs.device)  # [B, 128]\n\n    @staticmethod\n    def calculate_post_conv_height(height, kernel_size, stride, pad, n_convs):\n        \"\"\"Height of spec after n convolutions with fixed kernel/stride/pad.\"\"\"\n        for _ in range(n_convs):\n            height = (height - kernel_size + 2 * pad) // stride + 1\n        return height\n\n\nclass TextSummary(nn.Module):\n    def __init__(self, 
embedding_dim, encoder_output_dim):\n        super().__init__()\n        self.lstm = nn.LSTM(\n            encoder_output_dim,  # text embedding dimension from the text encoder\n            embedding_dim,  # fixed length output summary the lstm creates from the input\n            batch_first=True,\n            bidirectional=False,\n        )\n\n    def forward(self, inputs, input_lengths):\n        # Routine for fetching the last valid output of a dynamic LSTM with varying input lengths and padding\n        packed_seqs = nn.utils.rnn.pack_padded_sequence(\n            inputs, input_lengths.tolist(), batch_first=True, enforce_sorted=False\n        )  # dynamic rnn sequence padding\n        self.lstm.flatten_parameters()\n        _, (ht, _) = self.lstm(packed_seqs)\n        last_output = ht[-1]\n        return last_output\n\n\nclass PostEncoderMLP(nn.Module):\n    def __init__(self, input_size, hidden_size):\n        super().__init__()\n        self.hidden_size = hidden_size\n        modules = [\n            nn.Linear(input_size, hidden_size),  # Hidden Layer\n            nn.Tanh(),\n            nn.Linear(hidden_size, hidden_size * 2),\n        ]  # Output layer twice the size for mean and variance\n        self.net = nn.Sequential(*modules)\n        self.softplus = nn.Softplus()\n\n    def forward(self, _input):\n        mlp_output = self.net(_input)\n        # The mean parameter is unconstrained\n        mu = mlp_output[:, : self.hidden_size]\n        # The standard deviation must be positive. Parameterise with a softplus\n        sigma = self.softplus(mlp_output[:, self.hidden_size :])\n        return mu, sigma\n"
  },
  {
    "path": "TTS/tts/layers/tacotron/common_layers.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\n\nclass Linear(nn.Module):\n    \"\"\"Linear layer with a specific initialization.\n\n    Args:\n        in_features (int): number of channels in the input tensor.\n        out_features (int): number of channels in the output tensor.\n        bias (bool, optional): enable/disable bias in the layer. Defaults to True.\n        init_gain (str, optional): method to compute the gain in the weight initializtion based on the nonlinear activation used afterwards. Defaults to 'linear'.\n    \"\"\"\n\n    def __init__(self, in_features, out_features, bias=True, init_gain=\"linear\"):\n        super().__init__()\n        self.linear_layer = torch.nn.Linear(in_features, out_features, bias=bias)\n        self._init_w(init_gain)\n\n    def _init_w(self, init_gain):\n        torch.nn.init.xavier_uniform_(self.linear_layer.weight, gain=torch.nn.init.calculate_gain(init_gain))\n\n    def forward(self, x):\n        return self.linear_layer(x)\n\n\nclass LinearBN(nn.Module):\n    \"\"\"Linear layer with Batch Normalization.\n\n    x -> linear -> BN -> o\n\n    Args:\n        in_features (int): number of channels in the input tensor.\n        out_features (int ): number of channels in the output tensor.\n        bias (bool, optional): enable/disable bias in the linear layer. Defaults to True.\n        init_gain (str, optional): method to set the gain for weight initialization. 
Defaults to 'linear'.\n    \"\"\"\n\n    def __init__(self, in_features, out_features, bias=True, init_gain=\"linear\"):\n        super().__init__()\n        self.linear_layer = torch.nn.Linear(in_features, out_features, bias=bias)\n        self.batch_normalization = nn.BatchNorm1d(out_features, momentum=0.1, eps=1e-5)\n        self._init_w(init_gain)\n\n    def _init_w(self, init_gain):\n        torch.nn.init.xavier_uniform_(self.linear_layer.weight, gain=torch.nn.init.calculate_gain(init_gain))\n\n    def forward(self, x):\n        \"\"\"\n        Shapes:\n            x: [T, B, C] or [B, C]\n        \"\"\"\n        out = self.linear_layer(x)\n        if len(out.shape) == 3:\n            out = out.permute(1, 2, 0)\n        out = self.batch_normalization(out)\n        if len(out.shape) == 3:\n            out = out.permute(2, 0, 1)\n        return out\n\n\nclass Prenet(nn.Module):\n    \"\"\"Tacotron specific Prenet with an optional Batch Normalization.\n\n    Note:\n        Prenet with BN improves the model performance significantly especially\n    if it is enabled after learning a diagonal attention alignment with the original\n    prenet. However, if the target dataset is high quality then it also works from\n    the start. It is also suggested to disable dropout if BN is in use.\n\n        prenet_type == \"original\"\n            x -> [linear -> ReLU -> Dropout]xN -> o\n\n        prenet_type == \"bn\"\n            x -> [linear -> BN -> ReLU -> Dropout]xN -> o\n\n    Args:\n        in_features (int): number of channels in the input tensor and the inner layers.\n        prenet_type (str, optional): prenet type \"original\" or \"bn\". Defaults to \"original\".\n        prenet_dropout (bool, optional): dropout rate. Defaults to True.\n        dropout_at_inference (bool, optional): use dropout at inference. 
It leads to a better quality for some models.\n        out_features (list, optional): List of output channels for each prenet block.\n            It also defines number of the prenet blocks based on the length of argument list.\n            Defaults to [256, 256].\n        bias (bool, optional): enable/disable bias in prenet linear layers. Defaults to True.\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        in_features,\n        prenet_type=\"original\",\n        prenet_dropout=True,\n        dropout_at_inference=False,\n        out_features=[256, 256],\n        bias=True,\n    ):\n        super().__init__()\n        self.prenet_type = prenet_type\n        self.prenet_dropout = prenet_dropout\n        self.dropout_at_inference = dropout_at_inference\n        in_features = [in_features] + out_features[:-1]\n        if prenet_type == \"bn\":\n            self.linear_layers = nn.ModuleList(\n                [LinearBN(in_size, out_size, bias=bias) for (in_size, out_size) in zip(in_features, out_features)]\n            )\n        elif prenet_type == \"original\":\n            self.linear_layers = nn.ModuleList(\n                [Linear(in_size, out_size, bias=bias) for (in_size, out_size) in zip(in_features, out_features)]\n            )\n\n    def forward(self, x):\n        for linear in self.linear_layers:\n            if self.prenet_dropout:\n                x = F.dropout(F.relu(linear(x)), p=0.5, training=self.training or self.dropout_at_inference)\n            else:\n                x = F.relu(linear(x))\n        return x\n"
  },
  {
    "path": "TTS/tts/layers/tacotron/gst_layers.py",
    "content": "import torch\nimport torch.nn.functional as F\nfrom torch import nn\n\n\nclass GST(nn.Module):\n    \"\"\"Global Style Token Module for factorizing prosody in speech.\n\n    See https://arxiv.org/pdf/1803.09017\"\"\"\n\n    def __init__(self, num_mel, num_heads, num_style_tokens, gst_embedding_dim, embedded_speaker_dim=None):\n        super().__init__()\n        self.encoder = ReferenceEncoder(num_mel, gst_embedding_dim)\n        self.style_token_layer = StyleTokenLayer(num_heads, num_style_tokens, gst_embedding_dim, embedded_speaker_dim)\n\n    def forward(self, inputs, speaker_embedding=None):\n        enc_out = self.encoder(inputs)\n        # concat speaker_embedding\n        if speaker_embedding is not None:\n            enc_out = torch.cat([enc_out, speaker_embedding], dim=-1)\n        style_embed = self.style_token_layer(enc_out)\n\n        return style_embed\n\n\nclass ReferenceEncoder(nn.Module):\n    \"\"\"NN module creating a fixed size prosody embedding from a spectrogram.\n\n    inputs: mel spectrograms [batch_size, num_spec_frames, num_mel]\n    outputs: [batch_size, embedding_dim]\n    \"\"\"\n\n    def __init__(self, num_mel, embedding_dim):\n        super().__init__()\n        self.num_mel = num_mel\n        filters = [1] + [32, 32, 64, 64, 128, 128]\n        num_layers = len(filters) - 1\n        convs = [\n            nn.Conv2d(\n                in_channels=filters[i], out_channels=filters[i + 1], kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)\n            )\n            for i in range(num_layers)\n        ]\n        self.convs = nn.ModuleList(convs)\n        self.bns = nn.ModuleList([nn.BatchNorm2d(num_features=filter_size) for filter_size in filters[1:]])\n\n        post_conv_height = self.calculate_post_conv_height(num_mel, 3, 2, 1, num_layers)\n        self.recurrence = nn.GRU(\n            input_size=filters[-1] * post_conv_height, hidden_size=embedding_dim // 2, batch_first=True\n        )\n\n    def forward(self, 
inputs):\n        batch_size = inputs.size(0)\n        x = inputs.view(batch_size, 1, -1, self.num_mel)\n        # x: 4D tensor [batch_size, num_channels==1, num_frames, num_mel]\n        for conv, bn in zip(self.convs, self.bns):\n            x = conv(x)\n            x = bn(x)\n            x = F.relu(x)\n\n        x = x.transpose(1, 2)\n        # x: 4D tensor [batch_size, post_conv_width,\n        #               num_channels==128, post_conv_height]\n        post_conv_width = x.size(1)\n        x = x.contiguous().view(batch_size, post_conv_width, -1)\n        # x: 3D tensor [batch_size, post_conv_width,\n        #               num_channels*post_conv_height]\n        self.recurrence.flatten_parameters()\n        _, out = self.recurrence(x)\n        # out: 3D tensor [seq_len==1, batch_size, encoding_size=128]\n\n        return out.squeeze(0)\n\n    @staticmethod\n    def calculate_post_conv_height(height, kernel_size, stride, pad, n_convs):\n        \"\"\"Height of spec after n convolutions with fixed kernel/stride/pad.\"\"\"\n        for _ in range(n_convs):\n            height = (height - kernel_size + 2 * pad) // stride + 1\n        return height\n\n\nclass StyleTokenLayer(nn.Module):\n    \"\"\"NN Module attending to style tokens based on prosody encodings.\"\"\"\n\n    def __init__(self, num_heads, num_style_tokens, gst_embedding_dim, d_vector_dim=None):\n        super().__init__()\n\n        self.query_dim = gst_embedding_dim // 2\n\n        if d_vector_dim:\n            self.query_dim += d_vector_dim\n\n        self.key_dim = gst_embedding_dim // num_heads\n        self.style_tokens = nn.Parameter(torch.FloatTensor(num_style_tokens, self.key_dim))\n        nn.init.normal_(self.style_tokens, mean=0, std=0.5)\n        self.attention = MultiHeadAttention(\n            query_dim=self.query_dim, key_dim=self.key_dim, num_units=gst_embedding_dim, num_heads=num_heads\n        )\n\n    def forward(self, inputs):\n        batch_size = inputs.size(0)\n        
prosody_encoding = inputs.unsqueeze(1)\n        # prosody_encoding: 3D tensor [batch_size, 1, encoding_size==128]\n        tokens = torch.tanh(self.style_tokens).unsqueeze(0).expand(batch_size, -1, -1)\n        # tokens: 3D tensor [batch_size, num tokens, token embedding size]\n        style_embed = self.attention(prosody_encoding, tokens)\n\n        return style_embed\n\n\nclass MultiHeadAttention(nn.Module):\n    \"\"\"\n    input:\n        query --- [N, T_q, query_dim]\n        key --- [N, T_k, key_dim]\n    output:\n        out --- [N, T_q, num_units]\n    \"\"\"\n\n    def __init__(self, query_dim, key_dim, num_units, num_heads):\n        super().__init__()\n        self.num_units = num_units\n        self.num_heads = num_heads\n        self.key_dim = key_dim\n\n        self.W_query = nn.Linear(in_features=query_dim, out_features=num_units, bias=False)\n        self.W_key = nn.Linear(in_features=key_dim, out_features=num_units, bias=False)\n        self.W_value = nn.Linear(in_features=key_dim, out_features=num_units, bias=False)\n\n    def forward(self, query, key):\n        queries = self.W_query(query)  # [N, T_q, num_units]\n        keys = self.W_key(key)  # [N, T_k, num_units]\n        values = self.W_value(key)\n\n        split_size = self.num_units // self.num_heads\n        queries = torch.stack(torch.split(queries, split_size, dim=2), dim=0)  # [h, N, T_q, num_units/h]\n        keys = torch.stack(torch.split(keys, split_size, dim=2), dim=0)  # [h, N, T_k, num_units/h]\n        values = torch.stack(torch.split(values, split_size, dim=2), dim=0)  # [h, N, T_k, num_units/h]\n\n        # score = softmax(QK^T / (d_k**0.5))\n        scores = torch.matmul(queries, keys.transpose(2, 3))  # [h, N, T_q, T_k]\n        scores = scores / (self.key_dim**0.5)\n        scores = F.softmax(scores, dim=3)\n\n        # out = score * V\n        out = torch.matmul(scores, values)  # [h, N, T_q, num_units/h]\n        out = torch.cat(torch.split(out, 1, dim=0), 
dim=3).squeeze(0)  # [N, T_q, num_units]\n\n        return out\n"
  },
  {
    "path": "TTS/tts/layers/tacotron/tacotron.py",
    "content": "# coding: utf-8\n# adapted from https://github.com/r9y9/tacotron_pytorch\n\nimport torch\nfrom torch import nn\n\nfrom .attentions import init_attn\nfrom .common_layers import Prenet\n\n\nclass BatchNormConv1d(nn.Module):\n    r\"\"\"A wrapper for Conv1d with BatchNorm. It sets the activation\n    function between Conv and BatchNorm layers. BatchNorm layer\n    is initialized with the TF default values for momentum and eps.\n\n    Args:\n        in_channels: size of each input sample\n        out_channels: size of each output samples\n        kernel_size: kernel size of conv filters\n        stride: stride of conv filters\n        padding: padding of conv filters\n        activation: activation function set b/w Conv1d and BatchNorm\n\n    Shapes:\n        - input: (B, D)\n        - output: (B, D)\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, kernel_size, stride, padding, activation=None):\n        super().__init__()\n        self.padding = padding\n        self.padder = nn.ConstantPad1d(padding, 0)\n        self.conv1d = nn.Conv1d(\n            in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=0, bias=False\n        )\n        # Following tensorflow's default parameters\n        self.bn = nn.BatchNorm1d(out_channels, momentum=0.99, eps=1e-3)\n        self.activation = activation\n        # self.init_layers()\n\n    def init_layers(self):\n        if isinstance(self.activation, torch.nn.ReLU):\n            w_gain = \"relu\"\n        elif isinstance(self.activation, torch.nn.Tanh):\n            w_gain = \"tanh\"\n        elif self.activation is None:\n            w_gain = \"linear\"\n        else:\n            raise RuntimeError(\"Unknown activation function\")\n        torch.nn.init.xavier_uniform_(self.conv1d.weight, gain=torch.nn.init.calculate_gain(w_gain))\n\n    def forward(self, x):\n        x = self.padder(x)\n        x = self.conv1d(x)\n        x = self.bn(x)\n        if self.activation is not 
None:\n            x = self.activation(x)\n        return x\n\n\nclass Highway(nn.Module):\n    r\"\"\"Highway layers as explained in https://arxiv.org/abs/1505.00387\n\n    Args:\n        in_features (int): size of each input sample\n        out_feature (int): size of each output sample\n\n    Shapes:\n        - input: (B, *, H_in)\n        - output: (B, *, H_out)\n    \"\"\"\n\n    # TODO: Try GLU layer\n    def __init__(self, in_features, out_feature):\n        super().__init__()\n        self.H = nn.Linear(in_features, out_feature)\n        self.H.bias.data.zero_()\n        self.T = nn.Linear(in_features, out_feature)\n        self.T.bias.data.fill_(-1)\n        self.relu = nn.ReLU()\n        self.sigmoid = nn.Sigmoid()\n        # self.init_layers()\n\n    def init_layers(self):\n        torch.nn.init.xavier_uniform_(self.H.weight, gain=torch.nn.init.calculate_gain(\"relu\"))\n        torch.nn.init.xavier_uniform_(self.T.weight, gain=torch.nn.init.calculate_gain(\"sigmoid\"))\n\n    def forward(self, inputs):\n        H = self.relu(self.H(inputs))\n        T = self.sigmoid(self.T(inputs))\n        return H * T + inputs * (1.0 - T)\n\n\nclass CBHG(nn.Module):\n    \"\"\"CBHG module: a recurrent neural network composed of:\n    - 1-d convolution banks\n    - Highway networks + residual connections\n    - Bidirectional gated recurrent units\n\n    Args:\n        in_features (int): sample size\n        K (int): max filter size in conv bank\n        projections (list): conv channel sizes for conv projections\n        num_highways (int): number of highways layers\n\n    Shapes:\n        - input: (B, C, T_in)\n        - output: (B, T_in, C*2)\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        in_features,\n        K=16,\n        conv_bank_features=128,\n        conv_projections=[128, 128],\n        highway_features=128,\n        gru_features=128,\n        num_highways=4,\n    ):\n        super().__init__()\n        
self.in_features = in_features\n        self.conv_bank_features = conv_bank_features\n        self.highway_features = highway_features\n        self.gru_features = gru_features\n        self.conv_projections = conv_projections\n        self.relu = nn.ReLU()\n        # list of conv1d bank with filter size k=1...K\n        # TODO: try dilational layers instead\n        self.conv1d_banks = nn.ModuleList(\n            [\n                BatchNormConv1d(\n                    in_features,\n                    conv_bank_features,\n                    kernel_size=k,\n                    stride=1,\n                    padding=[(k - 1) // 2, k // 2],\n                    activation=self.relu,\n                )\n                for k in range(1, K + 1)\n            ]\n        )\n        # max pooling of conv bank, with padding\n        # TODO: try average pooling OR larger kernel size\n        out_features = [K * conv_bank_features] + conv_projections[:-1]\n        activations = [self.relu] * (len(conv_projections) - 1)\n        activations += [None]\n        # setup conv1d projection layers\n        layer_set = []\n        for in_size, out_size, ac in zip(out_features, conv_projections, activations):\n            layer = BatchNormConv1d(in_size, out_size, kernel_size=3, stride=1, padding=[1, 1], activation=ac)\n            layer_set.append(layer)\n        self.conv1d_projections = nn.ModuleList(layer_set)\n        # setup Highway layers\n        if self.highway_features != conv_projections[-1]:\n            self.pre_highway = nn.Linear(conv_projections[-1], highway_features, bias=False)\n        self.highways = nn.ModuleList([Highway(highway_features, highway_features) for _ in range(num_highways)])\n        # bi-directional GRU layer\n        self.gru = nn.GRU(gru_features, gru_features, 1, batch_first=True, bidirectional=True)\n\n    def forward(self, inputs):\n        # (B, in_features, T_in)\n        x = inputs\n        # (B, hid_features*K, T_in)\n        # Concat 
conv1d bank outputs\n        outs = []\n        for conv1d in self.conv1d_banks:\n            out = conv1d(x)\n            outs.append(out)\n        x = torch.cat(outs, dim=1)\n        assert x.size(1) == self.conv_bank_features * len(self.conv1d_banks)\n        for conv1d in self.conv1d_projections:\n            x = conv1d(x)\n        x += inputs\n        x = x.transpose(1, 2)\n        if self.highway_features != self.conv_projections[-1]:\n            x = self.pre_highway(x)\n        # Residual connection\n        # TODO: try residual scaling as in Deep Voice 3\n        # TODO: try plain residual layers\n        for highway in self.highways:\n            x = highway(x)\n        # (B, T_in, hid_features*2)\n        # TODO: replace GRU with convolution as in Deep Voice 3\n        self.gru.flatten_parameters()\n        outputs, _ = self.gru(x)\n        return outputs\n\n\nclass EncoderCBHG(nn.Module):\n    r\"\"\"CBHG module with Encoder specific arguments\"\"\"\n\n    def __init__(self):\n        super().__init__()\n        self.cbhg = CBHG(\n            128,\n            K=16,\n            conv_bank_features=128,\n            conv_projections=[128, 128],\n            highway_features=128,\n            gru_features=128,\n            num_highways=4,\n        )\n\n    def forward(self, x):\n        return self.cbhg(x)\n\n\nclass Encoder(nn.Module):\n    r\"\"\"Stack Prenet and CBHG module for encoder\n    Args:\n        inputs (FloatTensor): embedding features\n\n    Shapes:\n        - inputs: (B, T, D_in)\n        - outputs: (B, T, 128 * 2)\n    \"\"\"\n\n    def __init__(self, in_features):\n        super().__init__()\n        self.prenet = Prenet(in_features, out_features=[256, 128])\n        self.cbhg = EncoderCBHG()\n\n    def forward(self, inputs):\n        # B x T x prenet_dim\n        outputs = self.prenet(inputs)\n        outputs = self.cbhg(outputs.transpose(1, 2))\n        return outputs\n\n\nclass PostCBHG(nn.Module):\n    def __init__(self, mel_dim):\n   
     super().__init__()\n        self.cbhg = CBHG(\n            mel_dim,\n            K=8,\n            conv_bank_features=128,\n            conv_projections=[256, mel_dim],\n            highway_features=128,\n            gru_features=128,\n            num_highways=4,\n        )\n\n    def forward(self, x):\n        return self.cbhg(x)\n\n\nclass Decoder(nn.Module):\n    \"\"\"Tacotron decoder.\n\n    Args:\n        in_channels (int): number of input channels.\n        frame_channels (int): number of feature frame channels.\n        r (int): number of outputs per time step (reduction rate).\n        memory_size (int): size of the past window. if <= 0 memory_size = r\n        attn_type (string): type of attention used in decoder.\n        attn_windowing (bool): if true, define an attention window centered to maximum\n            attention response. It provides more robust attention alignment especially\n            at inference time.\n        attn_norm (string): attention normalization function. 'sigmoid' or 'softmax'.\n        prenet_type (string): 'original' or 'bn'.\n        prenet_dropout (float): prenet dropout rate.\n        forward_attn (bool): if true, use forward attention method. https://arxiv.org/abs/1807.06736\n        trans_agent (bool): if true, use transition agent. https://arxiv.org/abs/1807.06736\n        forward_attn_mask (bool): if true, mask attention values smaller than a threshold.\n        location_attn (bool): if true, use location sensitive attention.\n        attn_K (int): number of attention heads for GravesAttention.\n        separate_stopnet (bool): if true, detach stopnet input to prevent gradient flow.\n        d_vector_dim (int): size of speaker embedding vector, for multi-speaker training.\n        max_decoder_steps (int): Maximum number of steps allowed for the decoder. 
Defaults to 500.\n    \"\"\"\n\n    # Pylint gets confused by PyTorch conventions here\n    # pylint: disable=attribute-defined-outside-init\n\n    def __init__(\n        self,\n        in_channels,\n        frame_channels,\n        r,\n        memory_size,\n        attn_type,\n        attn_windowing,\n        attn_norm,\n        prenet_type,\n        prenet_dropout,\n        forward_attn,\n        trans_agent,\n        forward_attn_mask,\n        location_attn,\n        attn_K,\n        separate_stopnet,\n        max_decoder_steps,\n    ):\n        super().__init__()\n        self.r_init = r\n        self.r = r\n        self.in_channels = in_channels\n        self.max_decoder_steps = max_decoder_steps\n        self.use_memory_queue = memory_size > 0\n        self.memory_size = memory_size if memory_size > 0 else r\n        self.frame_channels = frame_channels\n        self.separate_stopnet = separate_stopnet\n        self.query_dim = 256\n        # memory -> |Prenet| -> processed_memory\n        prenet_dim = frame_channels * self.memory_size if self.use_memory_queue else frame_channels\n        self.prenet = Prenet(prenet_dim, prenet_type, prenet_dropout, out_features=[256, 128])\n        # processed_inputs, processed_memory -> |Attention| -> Attention, attention, RNN_State\n        # attention_rnn generates queries for the attention mechanism\n        self.attention_rnn = nn.GRUCell(in_channels + 128, self.query_dim)\n        self.attention = init_attn(\n            attn_type=attn_type,\n            query_dim=self.query_dim,\n            embedding_dim=in_channels,\n            attention_dim=128,\n            location_attention=location_attn,\n            attention_location_n_filters=32,\n            attention_location_kernel_size=31,\n            windowing=attn_windowing,\n            norm=attn_norm,\n            forward_attn=forward_attn,\n            trans_agent=trans_agent,\n            forward_attn_mask=forward_attn_mask,\n            attn_K=attn_K,\n        
)\n        # (processed_memory | attention context) -> |Linear| -> decoder_RNN_input\n        self.project_to_decoder_in = nn.Linear(256 + in_channels, 256)\n        # decoder_RNN_input -> |RNN| -> RNN_state\n        self.decoder_rnns = nn.ModuleList([nn.GRUCell(256, 256) for _ in range(2)])\n        # RNN_state -> |Linear| -> mel_spec\n        self.proj_to_mel = nn.Linear(256, frame_channels * self.r_init)\n        # learn init values instead of zero init.\n        self.stopnet = StopNet(256 + frame_channels * self.r_init)\n\n    def set_r(self, new_r):\n        self.r = new_r\n\n    def _reshape_memory(self, memory):\n        \"\"\"\n        Reshape the spectrograms for given 'r'\n        \"\"\"\n        # Grouping multiple frames if necessary\n        if memory.size(-1) == self.frame_channels:\n            memory = memory.view(memory.shape[0], memory.size(1) // self.r, -1)\n        # Time first (T_decoder, B, frame_channels)\n        memory = memory.transpose(0, 1)\n        return memory\n\n    def _init_states(self, inputs):\n        \"\"\"\n        Initialization of decoder states\n        \"\"\"\n        B = inputs.size(0)\n        # go frame as zeros matrix\n        if self.use_memory_queue:\n            self.memory_input = torch.zeros(1, device=inputs.device).repeat(B, self.frame_channels * self.memory_size)\n        else:\n            self.memory_input = torch.zeros(1, device=inputs.device).repeat(B, self.frame_channels)\n        # decoder states\n        self.attention_rnn_hidden = torch.zeros(1, device=inputs.device).repeat(B, 256)\n        self.decoder_rnn_hiddens = [\n            torch.zeros(1, device=inputs.device).repeat(B, 256) for idx in range(len(self.decoder_rnns))\n        ]\n        self.context_vec = inputs.data.new(B, self.in_channels).zero_()\n        # cache attention inputs\n        self.processed_inputs = self.attention.preprocess_inputs(inputs)\n\n    def _parse_outputs(self, outputs, attentions, stop_tokens):\n        # Back to batch 
first\n        attentions = torch.stack(attentions).transpose(0, 1)\n        stop_tokens = torch.stack(stop_tokens).transpose(0, 1)\n        outputs = torch.stack(outputs).transpose(0, 1).contiguous()\n        outputs = outputs.view(outputs.size(0), -1, self.frame_channels)\n        outputs = outputs.transpose(1, 2)\n        return outputs, attentions, stop_tokens\n\n    def decode(self, inputs, mask=None):\n        # Prenet\n        processed_memory = self.prenet(self.memory_input)\n        # Attention RNN\n        self.attention_rnn_hidden = self.attention_rnn(\n            torch.cat((processed_memory, self.context_vec), -1), self.attention_rnn_hidden\n        )\n        self.context_vec = self.attention(self.attention_rnn_hidden, inputs, self.processed_inputs, mask)\n        # Concat RNN output and attention context vector\n        decoder_input = self.project_to_decoder_in(torch.cat((self.attention_rnn_hidden, self.context_vec), -1))\n\n        # Pass through the decoder RNNs\n        for idx, decoder_rnn in enumerate(self.decoder_rnns):\n            self.decoder_rnn_hiddens[idx] = decoder_rnn(decoder_input, self.decoder_rnn_hiddens[idx])\n            # Residual connection\n            decoder_input = self.decoder_rnn_hiddens[idx] + decoder_input\n        decoder_output = decoder_input\n\n        # predict mel vectors from decoder vectors\n        output = self.proj_to_mel(decoder_output)\n        # output = torch.sigmoid(output)\n        # predict stop token\n        stopnet_input = torch.cat([decoder_output, output], -1)\n        if self.separate_stopnet:\n            stop_token = self.stopnet(stopnet_input.detach())\n        else:\n            stop_token = self.stopnet(stopnet_input)\n        output = output[:, : self.r * self.frame_channels]\n        return output, stop_token, self.attention.attention_weights\n\n    def _update_memory_input(self, new_memory):\n        if self.use_memory_queue:\n            if self.memory_size > self.r:\n                # 
memory queue size is larger than number of frames per decoder iter\n                self.memory_input = torch.cat(\n                    [new_memory, self.memory_input[:, : (self.memory_size - self.r) * self.frame_channels].clone()],\n                    dim=-1,\n                )\n            else:\n                # memory queue size smaller than number of frames per decoder iter\n                self.memory_input = new_memory[:, : self.memory_size * self.frame_channels]\n        else:\n            # use only the last frame prediction\n            # assert new_memory.shape[-1] == self.r * self.frame_channels\n            self.memory_input = new_memory[:, self.frame_channels * (self.r - 1) :]\n\n    def forward(self, inputs, memory, mask):\n        \"\"\"\n        Args:\n            inputs: Encoder outputs.\n            memory: Ground-truth frames used for teacher forcing. This method\n              requires them; use `inference()` for greedy decoding, where the last\n              output is fed back as the input.\n            mask: Attention mask for sequence padding.\n\n        Shapes:\n            - inputs: (B, T, D_out_enc)\n            - memory: (B, T_mel, D_mel)\n        \"\"\"\n        # Teacher forcing: group the ground-truth frames by the reduction rate `r`\n        memory = self._reshape_memory(memory)\n        outputs = []\n        attentions = []\n        stop_tokens = []\n        t = 0\n        self._init_states(inputs)\n        self.attention.init_states(inputs)\n        while len(outputs) < memory.size(0):\n            if t > 0:\n                new_memory = memory[t - 1]\n                self._update_memory_input(new_memory)\n\n            output, stop_token, attention = self.decode(inputs, mask)\n            outputs += [output]\n            attentions += [attention]\n            stop_tokens += [stop_token.squeeze(1)]\n            t += 1\n        return self._parse_outputs(outputs, attentions, stop_tokens)\n\n    def inference(self, inputs):\n        \"\"\"\n        Args:\n            inputs: 
encoder outputs.\n        Shapes:\n            - inputs: batch x time x encoder_out_dim\n        \"\"\"\n        outputs = []\n        attentions = []\n        stop_tokens = []\n        t = 0\n        self._init_states(inputs)\n        self.attention.init_states(inputs)\n        while True:\n            if t > 0:\n                new_memory = outputs[-1]\n                self._update_memory_input(new_memory)\n            output, stop_token, attention = self.decode(inputs, None)\n            stop_token = torch.sigmoid(stop_token.data)\n            outputs += [output]\n            attentions += [attention]\n            stop_tokens += [stop_token]\n            t += 1\n            if t > inputs.shape[1] / 4 and (stop_token > 0.6 or attention[:, -1].item() > 0.6):\n                break\n            if t > self.max_decoder_steps:\n                print(\"   | > Decoder stopped with 'max_decoder_steps'\")\n                break\n        return self._parse_outputs(outputs, attentions, stop_tokens)\n\n\nclass StopNet(nn.Module):\n    r\"\"\"Stopnet signalling decoder to stop inference.\n    Args:\n        in_features (int): feature dimension of input.\n    \"\"\"\n\n    def __init__(self, in_features):\n        super().__init__()\n        self.dropout = nn.Dropout(0.1)\n        self.linear = nn.Linear(in_features, 1)\n        torch.nn.init.xavier_uniform_(self.linear.weight, gain=torch.nn.init.calculate_gain(\"linear\"))\n\n    def forward(self, inputs):\n        outputs = self.dropout(inputs)\n        outputs = self.linear(outputs)\n        return outputs\n"
  },
  {
    "path": "TTS/tts/layers/tacotron/tacotron2.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nfrom .attentions import init_attn\nfrom .common_layers import Linear, Prenet\n\n\n# pylint: disable=no-value-for-parameter\n# pylint: disable=unexpected-keyword-arg\nclass ConvBNBlock(nn.Module):\n    r\"\"\"Convolutions with Batch Normalization and non-linear activation.\n\n    Args:\n        in_channels (int): number of input channels.\n        out_channels (int): number of output channels.\n        kernel_size (int): convolution kernel size.\n        activation (str): 'relu', 'tanh', None (linear).\n\n    Shapes:\n        - input: (B, C_in, T)\n        - output: (B, C_out, T)\n    \"\"\"\n\n    def __init__(self, in_channels, out_channels, kernel_size, activation=None):\n        super().__init__()\n        assert (kernel_size - 1) % 2 == 0\n        padding = (kernel_size - 1) // 2\n        self.convolution1d = nn.Conv1d(in_channels, out_channels, kernel_size, padding=padding)\n        self.batch_normalization = nn.BatchNorm1d(out_channels, momentum=0.1, eps=1e-5)\n        self.dropout = nn.Dropout(p=0.5)\n        if activation == \"relu\":\n            self.activation = nn.ReLU()\n        elif activation == \"tanh\":\n            self.activation = nn.Tanh()\n        else:\n            self.activation = nn.Identity()\n\n    def forward(self, x):\n        o = self.convolution1d(x)\n        o = self.batch_normalization(o)\n        o = self.activation(o)\n        o = self.dropout(o)\n        return o\n\n\nclass Postnet(nn.Module):\n    r\"\"\"Tacotron2 Postnet\n\n    Args:\n        in_out_channels (int): number of output channels.\n\n    Shapes:\n        - input: (B, C_in, T)\n        - output: (B, C_in, T)\n    \"\"\"\n\n    def __init__(self, in_out_channels, num_convs=5):\n        super().__init__()\n        self.convolutions = nn.ModuleList()\n        self.convolutions.append(ConvBNBlock(in_out_channels, 512, kernel_size=5, activation=\"tanh\"))\n        for _ in range(1, 
num_convs - 1):\n            self.convolutions.append(ConvBNBlock(512, 512, kernel_size=5, activation=\"tanh\"))\n        self.convolutions.append(ConvBNBlock(512, in_out_channels, kernel_size=5, activation=None))\n\n    def forward(self, x):\n        o = x\n        for layer in self.convolutions:\n            o = layer(o)\n        return o\n\n\nclass Encoder(nn.Module):\n    r\"\"\"Tacotron2 Encoder\n\n    Args:\n        in_out_channels (int): number of input and output channels.\n\n    Shapes:\n        - input: (B, C_in, T)\n        - output: (B, C_in, T)\n    \"\"\"\n\n    def __init__(self, in_out_channels=512):\n        super().__init__()\n        self.convolutions = nn.ModuleList()\n        for _ in range(3):\n            self.convolutions.append(ConvBNBlock(in_out_channels, in_out_channels, 5, \"relu\"))\n        self.lstm = nn.LSTM(\n            in_out_channels, int(in_out_channels / 2), num_layers=1, batch_first=True, bias=True, bidirectional=True\n        )\n        self.rnn_state = None\n\n    def forward(self, x, input_lengths):\n        o = x\n        for layer in self.convolutions:\n            o = layer(o)\n        o = o.transpose(1, 2)\n        o = nn.utils.rnn.pack_padded_sequence(o, input_lengths.cpu(), batch_first=True)\n        self.lstm.flatten_parameters()\n        o, _ = self.lstm(o)\n        o, _ = nn.utils.rnn.pad_packed_sequence(o, batch_first=True)\n        return o\n\n    def inference(self, x):\n        o = x\n        for layer in self.convolutions:\n            o = layer(o)\n        o = o.transpose(1, 2)\n        # self.lstm.flatten_parameters()\n        o, _ = self.lstm(o)\n        return o\n\n\n# adapted from https://github.com/NVIDIA/tacotron2/\nclass Decoder(nn.Module):\n    \"\"\"Tacotron2 decoder. 
We don't use Zoneout but Dropout between RNN layers.\n\n    Args:\n        in_channels (int): number of input channels.\n        frame_channels (int): number of feature frame channels.\n        r (int): number of outputs per time step (reduction rate).\n        memory_size (int): size of the past window. if <= 0 memory_size = r\n        attn_type (string): type of attention used in decoder.\n        attn_win (bool): if true, define an attention window centered to maximum\n            attention response. It provides more robust attention alignment especially\n            at inference time.\n        attn_norm (string): attention normalization function. 'sigmoid' or 'softmax'.\n        prenet_type (string): 'original' or 'bn'.\n        prenet_dropout (float): prenet dropout rate.\n        forward_attn (bool): if true, use forward attention method. https://arxiv.org/abs/1807.06736\n        trans_agent (bool): if true, use transition agent. https://arxiv.org/abs/1807.06736\n        forward_attn_mask (bool): if true, mask attention values smaller than a threshold.\n        location_attn (bool): if true, use location sensitive attention.\n        attn_K (int): number of attention heads for GravesAttention.\n        separate_stopnet (bool): if true, detach stopnet input to prevent gradient flow.\n        max_decoder_steps (int): Maximum number of steps allowed for the decoder. 
Defaults to 10000.\n    \"\"\"\n\n    # Pylint gets confused by PyTorch conventions here\n    # pylint: disable=attribute-defined-outside-init\n    def __init__(\n        self,\n        in_channels,\n        frame_channels,\n        r,\n        attn_type,\n        attn_win,\n        attn_norm,\n        prenet_type,\n        prenet_dropout,\n        forward_attn,\n        trans_agent,\n        forward_attn_mask,\n        location_attn,\n        attn_K,\n        separate_stopnet,\n        max_decoder_steps,\n    ):\n        super().__init__()\n        self.frame_channels = frame_channels\n        self.r_init = r\n        self.r = r\n        self.encoder_embedding_dim = in_channels\n        self.separate_stopnet = separate_stopnet\n        self.max_decoder_steps = max_decoder_steps\n        self.stop_threshold = 0.5\n\n        # model dimensions\n        self.query_dim = 1024\n        self.decoder_rnn_dim = 1024\n        self.prenet_dim = 256\n        self.attn_dim = 128\n        self.p_attention_dropout = 0.1\n        self.p_decoder_dropout = 0.1\n\n        # memory -> |Prenet| -> processed_memory\n        prenet_dim = self.frame_channels\n        self.prenet = Prenet(\n            prenet_dim, prenet_type, prenet_dropout, out_features=[self.prenet_dim, self.prenet_dim], bias=False\n        )\n\n        self.attention_rnn = nn.LSTMCell(self.prenet_dim + in_channels, self.query_dim, bias=True)\n\n        self.attention = init_attn(\n            attn_type=attn_type,\n            query_dim=self.query_dim,\n            embedding_dim=in_channels,\n            attention_dim=128,\n            location_attention=location_attn,\n            attention_location_n_filters=32,\n            attention_location_kernel_size=31,\n            windowing=attn_win,\n            norm=attn_norm,\n            forward_attn=forward_attn,\n            trans_agent=trans_agent,\n            forward_attn_mask=forward_attn_mask,\n            attn_K=attn_K,\n        )\n\n        self.decoder_rnn = 
nn.LSTMCell(self.query_dim + in_channels, self.decoder_rnn_dim, bias=True)\n\n        self.linear_projection = Linear(self.decoder_rnn_dim + in_channels, self.frame_channels * self.r_init)\n\n        self.stopnet = nn.Sequential(\n            nn.Dropout(0.1),\n            Linear(self.decoder_rnn_dim + self.frame_channels * self.r_init, 1, bias=True, init_gain=\"sigmoid\"),\n        )\n        self.memory_truncated = None\n\n    def set_r(self, new_r):\n        self.r = new_r\n\n    def get_go_frame(self, inputs):\n        B = inputs.size(0)\n        memory = torch.zeros(1, device=inputs.device).repeat(B, self.frame_channels * self.r)\n        return memory\n\n    def _init_states(self, inputs, mask, keep_states=False):\n        B = inputs.size(0)\n        # T = inputs.size(1)\n        if not keep_states:\n            self.query = torch.zeros(1, device=inputs.device).repeat(B, self.query_dim)\n            self.attention_rnn_cell_state = torch.zeros(1, device=inputs.device).repeat(B, self.query_dim)\n            self.decoder_hidden = torch.zeros(1, device=inputs.device).repeat(B, self.decoder_rnn_dim)\n            self.decoder_cell = torch.zeros(1, device=inputs.device).repeat(B, self.decoder_rnn_dim)\n            self.context = torch.zeros(1, device=inputs.device).repeat(B, self.encoder_embedding_dim)\n        self.inputs = inputs\n        self.processed_inputs = self.attention.preprocess_inputs(inputs)\n        self.mask = mask\n\n    def _reshape_memory(self, memory):\n        \"\"\"\n        Reshape the spectrograms for given 'r'\n        \"\"\"\n        # Grouping multiple frames if necessary\n        if memory.size(-1) == self.frame_channels:\n            memory = memory.view(memory.shape[0], memory.size(1) // self.r, -1)\n        # Time first (T_decoder, B, frame_channels)\n        memory = memory.transpose(0, 1)\n        return memory\n\n    def _parse_outputs(self, outputs, stop_tokens, alignments):\n        alignments = torch.stack(alignments).transpose(0, 
1)\n        stop_tokens = torch.stack(stop_tokens).transpose(0, 1)\n        outputs = torch.stack(outputs).transpose(0, 1).contiguous()\n        outputs = outputs.view(outputs.size(0), -1, self.frame_channels)\n        outputs = outputs.transpose(1, 2)\n        return outputs, stop_tokens, alignments\n\n    def _update_memory(self, memory):\n        if len(memory.shape) == 2:\n            return memory[:, self.frame_channels * (self.r - 1) :]\n        return memory[:, :, self.frame_channels * (self.r - 1) :]\n\n    def decode(self, memory):\n        \"\"\"\n        shapes:\n           - memory: B x r * self.frame_channels\n        \"\"\"\n        # self.context: B x D_en\n        # query_input: B x D_en + (r * self.frame_channels)\n        query_input = torch.cat((memory, self.context), -1)\n        # self.query and self.attention_rnn_cell_state : B x D_attn_rnn\n        self.query, self.attention_rnn_cell_state = self.attention_rnn(\n            query_input, (self.query, self.attention_rnn_cell_state)\n        )\n        self.query = F.dropout(self.query, self.p_attention_dropout, self.training)\n        self.attention_rnn_cell_state = F.dropout(\n            self.attention_rnn_cell_state, self.p_attention_dropout, self.training\n        )\n        # B x D_en\n        self.context = self.attention(self.query, self.inputs, self.processed_inputs, self.mask)\n        # B x (D_en + D_attn_rnn)\n        decoder_rnn_input = torch.cat((self.query, self.context), -1)\n        # self.decoder_hidden and self.decoder_cell: B x D_decoder_rnn\n        self.decoder_hidden, self.decoder_cell = self.decoder_rnn(\n            decoder_rnn_input, (self.decoder_hidden, self.decoder_cell)\n        )\n        self.decoder_hidden = F.dropout(self.decoder_hidden, self.p_decoder_dropout, self.training)\n        # B x (D_decoder_rnn + D_en)\n        decoder_hidden_context = torch.cat((self.decoder_hidden, self.context), dim=1)\n        # B x (self.r * self.frame_channels)\n        
decoder_output = self.linear_projection(decoder_hidden_context)\n        # B x (D_decoder_rnn + (self.r * self.frame_channels))\n        stopnet_input = torch.cat((self.decoder_hidden, decoder_output), dim=1)\n        if self.separate_stopnet:\n            stop_token = self.stopnet(stopnet_input.detach())\n        else:\n            stop_token = self.stopnet(stopnet_input)\n        # select outputs for the reduction rate self.r\n        decoder_output = decoder_output[:, : self.r * self.frame_channels]\n        return decoder_output, self.attention.attention_weights, stop_token\n\n    def forward(self, inputs, memories, mask):\n        r\"\"\"Train Decoder with teacher forcing.\n        Args:\n            inputs: Encoder outputs.\n            memories: Feature frames for teacher-forcing.\n            mask: Attention mask for sequence padding.\n\n        Shapes:\n            - inputs: (B, T, D_out_enc)\n            - memory: (B, T_mel, D_mel)\n            - outputs: (B, T_mel, D_mel)\n            - alignments: (B, T_in, T_out)\n            - stop_tokens: (B, T_out)\n        \"\"\"\n        memory = self.get_go_frame(inputs).unsqueeze(0)\n        memories = self._reshape_memory(memories)\n        memories = torch.cat((memory, memories), dim=0)\n        memories = self._update_memory(memories)\n        memories = self.prenet(memories)\n\n        self._init_states(inputs, mask=mask)\n        self.attention.init_states(inputs)\n\n        outputs, stop_tokens, alignments = [], [], []\n        while len(outputs) < memories.size(0) - 1:\n            memory = memories[len(outputs)]\n            decoder_output, attention_weights, stop_token = self.decode(memory)\n            outputs += [decoder_output.squeeze(1)]\n            stop_tokens += [stop_token.squeeze(1)]\n            alignments += [attention_weights]\n\n        outputs, stop_tokens, alignments = self._parse_outputs(outputs, stop_tokens, alignments)\n        return outputs, alignments, stop_tokens\n\n    def 
inference(self, inputs):\n        r\"\"\"Decoder inference without teacher forcing and use\n        Stopnet to stop decoder.\n        Args:\n            inputs: Encoder outputs.\n\n        Shapes:\n            - inputs: (B, T, D_out_enc)\n            - outputs: (B, T_mel, D_mel)\n            - alignments: (B, T_in, T_out)\n            - stop_tokens: (B, T_out)\n        \"\"\"\n        memory = self.get_go_frame(inputs)\n        memory = self._update_memory(memory)\n\n        self._init_states(inputs, mask=None)\n        self.attention.init_states(inputs)\n\n        outputs, stop_tokens, alignments, t = [], [], [], 0\n        while True:\n            memory = self.prenet(memory)\n            decoder_output, alignment, stop_token = self.decode(memory)\n            stop_token = torch.sigmoid(stop_token.data)\n            outputs += [decoder_output.squeeze(1)]\n            stop_tokens += [stop_token]\n            alignments += [alignment]\n\n            if stop_token > self.stop_threshold and t > inputs.shape[0] // 2:\n                break\n            if len(outputs) == self.max_decoder_steps:\n                print(f\"   > Decoder stopped with `max_decoder_steps` {self.max_decoder_steps}\")\n                break\n\n            memory = self._update_memory(decoder_output)\n            t += 1\n\n        outputs, stop_tokens, alignments = self._parse_outputs(outputs, stop_tokens, alignments)\n\n        return outputs, alignments, stop_tokens\n\n    def inference_truncated(self, inputs):\n        \"\"\"\n        Preserve decoder states for continuous inference\n        \"\"\"\n        if self.memory_truncated is None:\n            self.memory_truncated = self.get_go_frame(inputs)\n            self._init_states(inputs, mask=None, keep_states=False)\n        else:\n            self._init_states(inputs, mask=None, keep_states=True)\n\n        self.attention.init_states(inputs)\n        outputs, stop_tokens, alignments, t = [], [], [], 0\n        while True:\n            
memory = self.prenet(self.memory_truncated)\n            decoder_output, alignment, stop_token = self.decode(memory)\n            stop_token = torch.sigmoid(stop_token.data)\n            outputs += [decoder_output.squeeze(1)]\n            stop_tokens += [stop_token]\n            alignments += [alignment]\n\n            if stop_token > 0.7:\n                break\n            if len(outputs) == self.max_decoder_steps:\n                print(\"   | > Decoder stopped with 'max_decoder_steps'\")\n                break\n\n            self.memory_truncated = decoder_output\n            t += 1\n\n        outputs, stop_tokens, alignments = self._parse_outputs(outputs, stop_tokens, alignments)\n\n        return outputs, alignments, stop_tokens\n\n    def inference_step(self, inputs, t, memory=None):\n        \"\"\"\n        For debug purposes\n        \"\"\"\n        if t == 0:\n            memory = self.get_go_frame(inputs)\n            self._init_states(inputs, mask=None)\n\n        memory = self.prenet(memory)\n        # decode() returns (decoder_output, attention_weights, stop_token)\n        decoder_output, alignment, stop_token = self.decode(memory)\n        stop_token = torch.sigmoid(stop_token.data)\n        memory = decoder_output\n        return decoder_output, stop_token, alignment\n"
  },
  {
    "path": "TTS/tts/layers/vits/discriminator.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.nn.modules.conv import Conv1d\n\nfrom TTS.vocoder.models.hifigan_discriminator import DiscriminatorP, MultiPeriodDiscriminator\n\n\nclass DiscriminatorS(torch.nn.Module):\n    \"\"\"HiFiGAN Scale Discriminator. Channel sizes are different from the original HiFiGAN.\n\n    Args:\n        use_spectral_norm (bool): if `True` swith to spectral norm instead of weight norm.\n    \"\"\"\n\n    def __init__(self, use_spectral_norm=False):\n        super().__init__()\n        norm_f = nn.utils.spectral_norm if use_spectral_norm else nn.utils.weight_norm\n        self.convs = nn.ModuleList(\n            [\n                norm_f(Conv1d(1, 16, 15, 1, padding=7)),\n                norm_f(Conv1d(16, 64, 41, 4, groups=4, padding=20)),\n                norm_f(Conv1d(64, 256, 41, 4, groups=16, padding=20)),\n                norm_f(Conv1d(256, 1024, 41, 4, groups=64, padding=20)),\n                norm_f(Conv1d(1024, 1024, 41, 4, groups=256, padding=20)),\n                norm_f(Conv1d(1024, 1024, 5, 1, padding=2)),\n            ]\n        )\n        self.conv_post = norm_f(Conv1d(1024, 1, 3, 1, padding=1))\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input waveform.\n\n        Returns:\n            Tensor: discriminator scores.\n            List[Tensor]: list of features from the convolutiona layers.\n        \"\"\"\n        feat = []\n        for l in self.convs:\n            x = l(x)\n            x = torch.nn.functional.leaky_relu(x, 0.1)\n            feat.append(x)\n        x = self.conv_post(x)\n        feat.append(x)\n        x = torch.flatten(x, 1, -1)\n        return x, feat\n\n\nclass VitsDiscriminator(nn.Module):\n    \"\"\"VITS discriminator wrapping one Scale Discriminator and a stack of Period Discriminator.\n\n    ::\n        waveform -> ScaleDiscriminator() -> scores_sd, feats_sd --> append() -> scores, feats\n               |--> MultiPeriodDiscriminator() -> 
scores_mpd, feats_mpd ^\n\n    Args:\n        use_spectral_norm (bool): if `True` switch to spectral norm instead of weight norm.\n    \"\"\"\n\n    def __init__(self, periods=(2, 3, 5, 7, 11), use_spectral_norm=False):\n        super().__init__()\n        self.nets = nn.ModuleList()\n        self.nets.append(DiscriminatorS(use_spectral_norm=use_spectral_norm))\n        self.nets.extend([DiscriminatorP(i, use_spectral_norm=use_spectral_norm) for i in periods])\n\n    def forward(self, x, x_hat=None):\n        \"\"\"\n        Args:\n            x (Tensor): ground truth waveform.\n            x_hat (Tensor): predicted waveform.\n\n        Returns:\n            List[Tensor]: discriminator scores.\n            List[List[Tensor]]: list of lists of features from each layer of each discriminator.\n        \"\"\"\n        x_scores = []\n        x_hat_scores = [] if x_hat is not None else None\n        x_feats = []\n        x_hat_feats = [] if x_hat is not None else None\n        for net in self.nets:\n            x_score, x_feat = net(x)\n            x_scores.append(x_score)\n            x_feats.append(x_feat)\n            if x_hat is not None:\n                x_hat_score, x_hat_feat = net(x_hat)\n                x_hat_scores.append(x_hat_score)\n                x_hat_feats.append(x_hat_feat)\n        return x_scores, x_feats, x_hat_scores, x_hat_feats\n"
  },
  {
    "path": "TTS/tts/layers/vits/networks.py",
    "content": "import math\n\nimport torch\nfrom torch import nn\n\nfrom TTS.tts.layers.glow_tts.glow import WN\nfrom TTS.tts.layers.glow_tts.transformer import RelativePositionTransformer\nfrom TTS.tts.utils.helpers import sequence_mask\n\nLRELU_SLOPE = 0.1\n\n\ndef convert_pad_shape(pad_shape):\n    l = pad_shape[::-1]\n    pad_shape = [item for sublist in l for item in sublist]\n    return pad_shape\n\n\ndef init_weights(m, mean=0.0, std=0.01):\n    classname = m.__class__.__name__\n    if classname.find(\"Conv\") != -1:\n        m.weight.data.normal_(mean, std)\n\n\ndef get_padding(kernel_size, dilation=1):\n    return int((kernel_size * dilation - dilation) / 2)\n\n\nclass TextEncoder(nn.Module):\n    def __init__(\n        self,\n        n_vocab: int,\n        out_channels: int,\n        hidden_channels: int,\n        hidden_channels_ffn: int,\n        num_heads: int,\n        num_layers: int,\n        kernel_size: int,\n        dropout_p: float,\n        language_emb_dim: int = None,\n    ):\n        \"\"\"Text Encoder for VITS model.\n\n        Args:\n            n_vocab (int): Number of characters for the embedding layer.\n            out_channels (int): Number of channels for the output.\n            hidden_channels (int): Number of channels for the hidden layers.\n            hidden_channels_ffn (int): Number of channels for the convolutional layers.\n            num_heads (int): Number of attention heads for the Transformer layers.\n            num_layers (int): Number of Transformer layers.\n            kernel_size (int): Kernel size for the FFN layers in Transformer network.\n            dropout_p (float): Dropout rate for the Transformer layers.\n        \"\"\"\n        super().__init__()\n        self.out_channels = out_channels\n        self.hidden_channels = hidden_channels\n\n        self.emb = nn.Embedding(n_vocab, hidden_channels)\n\n        nn.init.normal_(self.emb.weight, 0.0, hidden_channels**-0.5)\n\n        if language_emb_dim:\n          
  hidden_channels += language_emb_dim\n\n        self.encoder = RelativePositionTransformer(\n            in_channels=hidden_channels,\n            out_channels=hidden_channels,\n            hidden_channels=hidden_channels,\n            hidden_channels_ffn=hidden_channels_ffn,\n            num_heads=num_heads,\n            num_layers=num_layers,\n            kernel_size=kernel_size,\n            dropout_p=dropout_p,\n            layer_norm_type=\"2\",\n            rel_attn_window_size=4,\n        )\n\n        self.proj = nn.Conv1d(hidden_channels, out_channels * 2, 1)\n\n    def forward(self, x, x_lengths, lang_emb=None):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, T]`\n            - x_length: :math:`[B]`\n        \"\"\"\n        assert x.shape[0] == x_lengths.shape[0]\n        x = self.emb(x) * math.sqrt(self.hidden_channels)  # [b, t, h]\n\n        # concat the lang emb in embedding chars\n        if lang_emb is not None:\n            x = torch.cat((x, lang_emb.transpose(2, 1).expand(x.size(0), x.size(1), -1)), dim=-1)\n\n        x = torch.transpose(x, 1, -1)  # [b, h, t]\n        x_mask = torch.unsqueeze(sequence_mask(x_lengths, x.size(2)), 1).to(x.dtype)  # [b, 1, t]\n\n        x = self.encoder(x * x_mask, x_mask)\n        stats = self.proj(x) * x_mask\n\n        m, logs = torch.split(stats, self.out_channels, dim=1)\n        return x, m, logs, x_mask\n\n\nclass ResidualCouplingBlock(nn.Module):\n    def __init__(\n        self,\n        channels,\n        hidden_channels,\n        kernel_size,\n        dilation_rate,\n        num_layers,\n        dropout_p=0,\n        cond_channels=0,\n        mean_only=False,\n    ):\n        assert channels % 2 == 0, \"channels should be divisible by 2\"\n        super().__init__()\n        self.half_channels = channels // 2\n        self.mean_only = mean_only\n        # input layer\n        self.pre = nn.Conv1d(self.half_channels, hidden_channels, 1)\n        # coupling layers\n        self.enc = WN(\n     
       hidden_channels,\n            hidden_channels,\n            kernel_size,\n            dilation_rate,\n            num_layers,\n            dropout_p=dropout_p,\n            c_in_channels=cond_channels,\n        )\n        # output layer\n        # Initializing last layer to 0 makes the affine coupling layers\n        # do nothing at first.  This helps with training stability\n        self.post = nn.Conv1d(hidden_channels, self.half_channels * (2 - mean_only), 1)\n        self.post.weight.data.zero_()\n        self.post.bias.data.zero_()\n\n    def forward(self, x, x_mask, g=None, reverse=False):\n        \"\"\"\n        Note:\n            Set `reverse` to True for inference.\n\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n            - g: :math:`[B, C, 1]`\n        \"\"\"\n        x0, x1 = torch.split(x, [self.half_channels] * 2, 1)\n        h = self.pre(x0) * x_mask\n        h = self.enc(h, x_mask, g=g)\n        stats = self.post(h) * x_mask\n        if not self.mean_only:\n            m, log_scale = torch.split(stats, [self.half_channels] * 2, 1)\n        else:\n            m = stats\n            log_scale = torch.zeros_like(m)\n\n        if not reverse:\n            x1 = m + x1 * torch.exp(log_scale) * x_mask\n            x = torch.cat([x0, x1], 1)\n            logdet = torch.sum(log_scale, [1, 2])\n            return x, logdet\n        else:\n            x1 = (x1 - m) * torch.exp(-log_scale) * x_mask\n            x = torch.cat([x0, x1], 1)\n            return x\n\n\nclass ResidualCouplingBlocks(nn.Module):\n    def __init__(\n        self,\n        channels: int,\n        hidden_channels: int,\n        kernel_size: int,\n        dilation_rate: int,\n        num_layers: int,\n        num_flows=4,\n        cond_channels=0,\n    ):\n        \"\"\"Residual Coupling blocks for VITS flow layers.\n\n        Args:\n            channels (int): Number of input and output tensor channels.\n            hidden_channels 
(int): Number of hidden network channels.\n            kernel_size (int): Kernel size of the WaveNet layers.\n            dilation_rate (int): Dilation rate of the WaveNet layers.\n            num_layers (int): Number of the WaveNet layers.\n            num_flows (int, optional): Number of Residual Coupling blocks. Defaults to 4.\n            cond_channels (int, optional): Number of channels of the conditioning tensor. Defaults to 0.\n        \"\"\"\n        super().__init__()\n        self.channels = channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dilation_rate = dilation_rate\n        self.num_layers = num_layers\n        self.num_flows = num_flows\n        self.cond_channels = cond_channels\n\n        self.flows = nn.ModuleList()\n        for _ in range(num_flows):\n            self.flows.append(\n                ResidualCouplingBlock(\n                    channels,\n                    hidden_channels,\n                    kernel_size,\n                    dilation_rate,\n                    num_layers,\n                    cond_channels=cond_channels,\n                    mean_only=True,\n                )\n            )\n\n    def forward(self, x, x_mask, g=None, reverse=False):\n        \"\"\"\n        Note:\n            Set `reverse` to True for inference.\n\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n            - g: :math:`[B, C, 1]`\n        \"\"\"\n        if not reverse:\n            for flow in self.flows:\n                x, _ = flow(x, x_mask, g=g, reverse=reverse)\n                x = torch.flip(x, [1])\n        else:\n            for flow in reversed(self.flows):\n                x = torch.flip(x, [1])\n                x = flow(x, x_mask, g=g, reverse=reverse)\n        return x\n\n\nclass PosteriorEncoder(nn.Module):\n    def __init__(\n        self,\n        in_channels: int,\n        out_channels: int,\n        hidden_channels: 
int,\n        kernel_size: int,\n        dilation_rate: int,\n        num_layers: int,\n        cond_channels=0,\n    ):\n        \"\"\"Posterior Encoder of VITS model.\n\n        ::\n            x -> conv1x1() -> WaveNet() (non-causal) -> conv1x1() -> split() -> [m, s] -> sample(m, s) -> z\n\n        Args:\n            in_channels (int): Number of input tensor channels.\n            out_channels (int): Number of output tensor channels.\n            hidden_channels (int): Number of hidden channels.\n            kernel_size (int): Kernel size of the WaveNet convolution layers.\n            dilation_rate (int): Dilation rate of the WaveNet layers.\n            num_layers (int): Number of the WaveNet layers.\n            cond_channels (int, optional): Number of conditioning tensor channels. Defaults to 0.\n        \"\"\"\n        super().__init__()\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dilation_rate = dilation_rate\n        self.num_layers = num_layers\n        self.cond_channels = cond_channels\n\n        self.pre = nn.Conv1d(in_channels, hidden_channels, 1)\n        self.enc = WN(\n            hidden_channels, hidden_channels, kernel_size, dilation_rate, num_layers, c_in_channels=cond_channels\n        )\n        self.proj = nn.Conv1d(hidden_channels, out_channels * 2, 1)\n\n    def forward(self, x, x_lengths, g=None):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_lengths: :math:`[B, 1]`\n            - g: :math:`[B, C, 1]`\n        \"\"\"\n        x_mask = torch.unsqueeze(sequence_mask(x_lengths, x.size(2)), 1).to(x.dtype)\n        x = self.pre(x) * x_mask\n        x = self.enc(x, x_mask, g=g)\n        stats = self.proj(x) * x_mask\n        mean, log_scale = torch.split(stats, self.out_channels, dim=1)\n        z = (mean + torch.randn_like(mean) * torch.exp(log_scale)) * x_mask\n     
   return z, mean, log_scale, x_mask\n"
  },
  {
    "path": "TTS/tts/layers/vits/stochastic_duration_predictor.py",
    "content": "import math\n\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nfrom TTS.tts.layers.generic.normalization import LayerNorm2\nfrom TTS.tts.layers.vits.transforms import piecewise_rational_quadratic_transform\n\n\nclass DilatedDepthSeparableConv(nn.Module):\n    def __init__(self, channels, kernel_size, num_layers, dropout_p=0.0) -> torch.tensor:\n        \"\"\"Dilated Depth-wise Separable Convolution module.\n\n        ::\n            x |-> DDSConv(x) -> LayerNorm(x) -> GeLU(x) -> Conv1x1(x) -> LayerNorm(x) -> GeLU(x) -> + -> o\n              |-------------------------------------------------------------------------------------^\n\n        Args:\n            channels ([type]): [description]\n            kernel_size ([type]): [description]\n            num_layers ([type]): [description]\n            dropout_p (float, optional): [description]. Defaults to 0.0.\n\n        Returns:\n            torch.tensor: Network output masked by the input sequence mask.\n        \"\"\"\n        super().__init__()\n        self.num_layers = num_layers\n\n        self.convs_sep = nn.ModuleList()\n        self.convs_1x1 = nn.ModuleList()\n        self.norms_1 = nn.ModuleList()\n        self.norms_2 = nn.ModuleList()\n        for i in range(num_layers):\n            dilation = kernel_size**i\n            padding = (kernel_size * dilation - dilation) // 2\n            self.convs_sep.append(\n                nn.Conv1d(channels, channels, kernel_size, groups=channels, dilation=dilation, padding=padding)\n            )\n            self.convs_1x1.append(nn.Conv1d(channels, channels, 1))\n            self.norms_1.append(LayerNorm2(channels))\n            self.norms_2.append(LayerNorm2(channels))\n        self.dropout = nn.Dropout(dropout_p)\n\n    def forward(self, x, x_mask, g=None):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n        \"\"\"\n        if g is not None:\n            x 
= x + g\n        for i in range(self.num_layers):\n            y = self.convs_sep[i](x * x_mask)\n            y = self.norms_1[i](y)\n            y = F.gelu(y)\n            y = self.convs_1x1[i](y)\n            y = self.norms_2[i](y)\n            y = F.gelu(y)\n            y = self.dropout(y)\n            x = x + y\n        return x * x_mask\n\n\nclass ElementwiseAffine(nn.Module):\n    \"\"\"Element-wise affine transform like no-population stats BatchNorm alternative.\n\n    Args:\n        channels (int): Number of input tensor channels.\n    \"\"\"\n\n    def __init__(self, channels):\n        super().__init__()\n        self.translation = nn.Parameter(torch.zeros(channels, 1))\n        self.log_scale = nn.Parameter(torch.zeros(channels, 1))\n\n    def forward(self, x, x_mask, reverse=False, **kwargs):  # pylint: disable=unused-argument\n        if not reverse:\n            y = (x * torch.exp(self.log_scale) + self.translation) * x_mask\n            logdet = torch.sum(self.log_scale * x_mask, [1, 2])\n            return y, logdet\n        x = (x - self.translation) * torch.exp(-self.log_scale) * x_mask\n        return x\n\n\nclass ConvFlow(nn.Module):\n    \"\"\"Dilated depth separable convolutional based spline flow.\n\n    Args:\n        in_channels (int): Number of input tensor channels.\n        hidden_channels (int): Number of in network channels.\n        kernel_size (int): Convolutional kernel size.\n        num_layers (int): Number of convolutional layers.\n        num_bins (int, optional): Number of spline bins. Defaults to 10.\n        tail_bound (float, optional): Tail bound for PRQT. 
Defaults to 5.0.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels: int,\n        hidden_channels: int,\n        kernel_size: int,\n        num_layers: int,\n        num_bins=10,\n        tail_bound=5.0,\n    ):\n        super().__init__()\n        self.num_bins = num_bins\n        self.tail_bound = tail_bound\n        self.hidden_channels = hidden_channels\n        self.half_channels = in_channels // 2\n\n        self.pre = nn.Conv1d(self.half_channels, hidden_channels, 1)\n        self.convs = DilatedDepthSeparableConv(hidden_channels, kernel_size, num_layers, dropout_p=0.0)\n        self.proj = nn.Conv1d(hidden_channels, self.half_channels * (num_bins * 3 - 1), 1)\n        self.proj.weight.data.zero_()\n        self.proj.bias.data.zero_()\n\n    def forward(self, x, x_mask, g=None, reverse=False):\n        x0, x1 = torch.split(x, [self.half_channels] * 2, 1)\n        h = self.pre(x0)\n        h = self.convs(h, x_mask, g=g)\n        h = self.proj(h) * x_mask\n\n        b, c, t = x0.shape\n        h = h.reshape(b, c, -1, t).permute(0, 1, 3, 2)  # [b, cx?, t] -> [b, c, t, ?]\n\n        unnormalized_widths = h[..., : self.num_bins] / math.sqrt(self.hidden_channels)\n        unnormalized_heights = h[..., self.num_bins : 2 * self.num_bins] / math.sqrt(self.hidden_channels)\n        unnormalized_derivatives = h[..., 2 * self.num_bins :]\n\n        x1, logabsdet = piecewise_rational_quadratic_transform(\n            x1,\n            unnormalized_widths,\n            unnormalized_heights,\n            unnormalized_derivatives,\n            inverse=reverse,\n            tails=\"linear\",\n            tail_bound=self.tail_bound,\n        )\n\n        x = torch.cat([x0, x1], 1) * x_mask\n        logdet = torch.sum(logabsdet * x_mask, [1, 2])\n        if not reverse:\n            return x, logdet\n        return x\n\n\nclass StochasticDurationPredictor(nn.Module):\n    \"\"\"Stochastic duration predictor with Spline Flows.\n\n    It applies Variational 
Dequantization and Variational Data Augmentation.\n\n    Paper:\n        SDP: https://arxiv.org/pdf/2106.06103.pdf\n        Spline Flow: https://arxiv.org/abs/1906.04032\n\n    ::\n        ## Inference\n\n        x -> TextCondEncoder() -> Flow() -> dr_hat\n        noise ----------------------^\n\n        ## Training\n                                                                              |---------------------|\n        x -> TextCondEncoder() -> + -> PosteriorEncoder() -> split() -> z_u, z_v -> (d - z_u) -> concat() -> Flow() -> noise\n        d -> DurCondEncoder()  -> ^                                                    |\n        |------------------------------------------------------------------------------|\n\n    Args:\n        in_channels (int): Number of input tensor channels.\n        hidden_channels (int): Number of hidden channels.\n        kernel_size (int): Kernel size of convolutional layers.\n        dropout_p (float): Dropout rate.\n        num_flows (int, optional): Number of flow blocks. Defaults to 4.\n        cond_channels (int, optional): Number of channels of conditioning tensor. 
Defaults to 0.\n    \"\"\"\n\n    def __init__(\n        self,\n        in_channels: int,\n        hidden_channels: int,\n        kernel_size: int,\n        dropout_p: float,\n        num_flows=4,\n        cond_channels=0,\n        language_emb_dim=0,\n    ):\n        super().__init__()\n\n        # add language embedding dim in the input\n        if language_emb_dim:\n            in_channels += language_emb_dim\n\n        # condition encoder text\n        self.pre = nn.Conv1d(in_channels, hidden_channels, 1)\n        self.convs = DilatedDepthSeparableConv(hidden_channels, kernel_size, num_layers=3, dropout_p=dropout_p)\n        self.proj = nn.Conv1d(hidden_channels, hidden_channels, 1)\n\n        # posterior encoder\n        self.flows = nn.ModuleList()\n        self.flows.append(ElementwiseAffine(2))\n        self.flows += [ConvFlow(2, hidden_channels, kernel_size, num_layers=3) for _ in range(num_flows)]\n\n        # condition encoder duration\n        self.post_pre = nn.Conv1d(1, hidden_channels, 1)\n        self.post_convs = DilatedDepthSeparableConv(hidden_channels, kernel_size, num_layers=3, dropout_p=dropout_p)\n        self.post_proj = nn.Conv1d(hidden_channels, hidden_channels, 1)\n\n        # flow layers\n        self.post_flows = nn.ModuleList()\n        self.post_flows.append(ElementwiseAffine(2))\n        self.post_flows += [ConvFlow(2, hidden_channels, kernel_size, num_layers=3) for _ in range(num_flows)]\n\n        if cond_channels != 0 and cond_channels is not None:\n            self.cond = nn.Conv1d(cond_channels, hidden_channels, 1)\n\n        if language_emb_dim != 0 and language_emb_dim is not None:\n            self.cond_lang = nn.Conv1d(language_emb_dim, hidden_channels, 1)\n\n    def forward(self, x, x_mask, dr=None, g=None, lang_emb=None, reverse=False, noise_scale=1.0):\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, C, T]`\n            - x_mask: :math:`[B, 1, T]`\n            - dr: :math:`[B, 1, T]`\n            - g: 
:math:`[B, C]`\n        \"\"\"\n        # condition encoder text\n        x = self.pre(x)\n        if g is not None:\n            x = x + self.cond(g)\n\n        if lang_emb is not None:\n            x = x + self.cond_lang(lang_emb)\n\n        x = self.convs(x, x_mask)\n        x = self.proj(x) * x_mask\n\n        if not reverse:\n            flows = self.flows\n            assert dr is not None\n\n            # condition encoder duration\n            h = self.post_pre(dr)\n            h = self.post_convs(h, x_mask)\n            h = self.post_proj(h) * x_mask\n            noise = torch.randn(dr.size(0), 2, dr.size(2)).to(device=x.device, dtype=x.dtype) * x_mask\n            z_q = noise\n\n            # posterior encoder\n            logdet_tot_q = 0.0\n            for idx, flow in enumerate(self.post_flows):\n                z_q, logdet_q = flow(z_q, x_mask, g=(x + h))\n                logdet_tot_q = logdet_tot_q + logdet_q\n                if idx > 0:\n                    z_q = torch.flip(z_q, [1])\n\n            z_u, z_v = torch.split(z_q, [1, 1], 1)\n            u = torch.sigmoid(z_u) * x_mask\n            z0 = (dr - u) * x_mask\n\n            # posterior encoder - neg log likelihood\n            logdet_tot_q += torch.sum((F.logsigmoid(z_u) + F.logsigmoid(-z_u)) * x_mask, [1, 2])\n            nll_posterior_encoder = (\n                torch.sum(-0.5 * (math.log(2 * math.pi) + (noise**2)) * x_mask, [1, 2]) - logdet_tot_q\n            )\n\n            z0 = torch.log(torch.clamp_min(z0, 1e-5)) * x_mask\n            logdet_tot = torch.sum(-z0, [1, 2])\n            z = torch.cat([z0, z_v], 1)\n\n            # flow layers\n            for idx, flow in enumerate(flows):\n                z, logdet = flow(z, x_mask, g=x, reverse=reverse)\n                logdet_tot = logdet_tot + logdet\n                if idx > 0:\n                    z = torch.flip(z, [1])\n\n            # flow layers - neg log likelihood\n            nll_flow_layers = torch.sum(0.5 * (math.log(2 * 
math.pi) + (z**2)) * x_mask, [1, 2]) - logdet_tot\n            return nll_flow_layers + nll_posterior_encoder\n\n        flows = list(reversed(self.flows))\n        flows = flows[:-2] + [flows[-1]]  # remove a useless vflow\n        z = torch.randn(x.size(0), 2, x.size(2)).to(device=x.device, dtype=x.dtype) * noise_scale\n        for flow in flows:\n            z = torch.flip(z, [1])\n            z = flow(z, x_mask, g=x, reverse=reverse)\n\n        z0, _ = torch.split(z, [1, 1], 1)\n        logw = z0\n        return logw\n"
  },
  {
    "path": "TTS/tts/layers/vits/transforms.py",
    "content": "# adopted from https://github.com/bayesiains/nflows\n\nimport numpy as np\nimport torch\nfrom torch.nn import functional as F\n\nDEFAULT_MIN_BIN_WIDTH = 1e-3\nDEFAULT_MIN_BIN_HEIGHT = 1e-3\nDEFAULT_MIN_DERIVATIVE = 1e-3\n\n\ndef piecewise_rational_quadratic_transform(\n    inputs,\n    unnormalized_widths,\n    unnormalized_heights,\n    unnormalized_derivatives,\n    inverse=False,\n    tails=None,\n    tail_bound=1.0,\n    min_bin_width=DEFAULT_MIN_BIN_WIDTH,\n    min_bin_height=DEFAULT_MIN_BIN_HEIGHT,\n    min_derivative=DEFAULT_MIN_DERIVATIVE,\n):\n    if tails is None:\n        spline_fn = rational_quadratic_spline\n        spline_kwargs = {}\n    else:\n        spline_fn = unconstrained_rational_quadratic_spline\n        spline_kwargs = {\"tails\": tails, \"tail_bound\": tail_bound}\n\n    outputs, logabsdet = spline_fn(\n        inputs=inputs,\n        unnormalized_widths=unnormalized_widths,\n        unnormalized_heights=unnormalized_heights,\n        unnormalized_derivatives=unnormalized_derivatives,\n        inverse=inverse,\n        min_bin_width=min_bin_width,\n        min_bin_height=min_bin_height,\n        min_derivative=min_derivative,\n        **spline_kwargs,\n    )\n    return outputs, logabsdet\n\n\ndef searchsorted(bin_locations, inputs, eps=1e-6):\n    bin_locations[..., -1] += eps\n    return torch.sum(inputs[..., None] >= bin_locations, dim=-1) - 1\n\n\ndef unconstrained_rational_quadratic_spline(\n    inputs,\n    unnormalized_widths,\n    unnormalized_heights,\n    unnormalized_derivatives,\n    inverse=False,\n    tails=\"linear\",\n    tail_bound=1.0,\n    min_bin_width=DEFAULT_MIN_BIN_WIDTH,\n    min_bin_height=DEFAULT_MIN_BIN_HEIGHT,\n    min_derivative=DEFAULT_MIN_DERIVATIVE,\n):\n    inside_interval_mask = (inputs >= -tail_bound) & (inputs <= tail_bound)\n    outside_interval_mask = ~inside_interval_mask\n\n    outputs = torch.zeros_like(inputs)\n    logabsdet = torch.zeros_like(inputs)\n\n    if tails == \"linear\":\n 
       unnormalized_derivatives = F.pad(unnormalized_derivatives, pad=(1, 1))\n        constant = np.log(np.exp(1 - min_derivative) - 1)\n        unnormalized_derivatives[..., 0] = constant\n        unnormalized_derivatives[..., -1] = constant\n\n        outputs[outside_interval_mask] = inputs[outside_interval_mask]\n        logabsdet[outside_interval_mask] = 0\n    else:\n        raise RuntimeError(\"{} tails are not implemented.\".format(tails))\n\n    outputs[inside_interval_mask], logabsdet[inside_interval_mask] = rational_quadratic_spline(\n        inputs=inputs[inside_interval_mask],\n        unnormalized_widths=unnormalized_widths[inside_interval_mask, :],\n        unnormalized_heights=unnormalized_heights[inside_interval_mask, :],\n        unnormalized_derivatives=unnormalized_derivatives[inside_interval_mask, :],\n        inverse=inverse,\n        left=-tail_bound,\n        right=tail_bound,\n        bottom=-tail_bound,\n        top=tail_bound,\n        min_bin_width=min_bin_width,\n        min_bin_height=min_bin_height,\n        min_derivative=min_derivative,\n    )\n\n    return outputs, logabsdet\n\n\ndef rational_quadratic_spline(\n    inputs,\n    unnormalized_widths,\n    unnormalized_heights,\n    unnormalized_derivatives,\n    inverse=False,\n    left=0.0,\n    right=1.0,\n    bottom=0.0,\n    top=1.0,\n    min_bin_width=DEFAULT_MIN_BIN_WIDTH,\n    min_bin_height=DEFAULT_MIN_BIN_HEIGHT,\n    min_derivative=DEFAULT_MIN_DERIVATIVE,\n):\n    if torch.min(inputs) < left or torch.max(inputs) > right:\n        raise ValueError(\"Input to a transform is not within its domain\")\n\n    num_bins = unnormalized_widths.shape[-1]\n\n    if min_bin_width * num_bins > 1.0:\n        raise ValueError(\"Minimal bin width too large for the number of bins\")\n    if min_bin_height * num_bins > 1.0:\n        raise ValueError(\"Minimal bin height too large for the number of bins\")\n\n    widths = F.softmax(unnormalized_widths, dim=-1)\n    widths = min_bin_width + (1 
- min_bin_width * num_bins) * widths\n    cumwidths = torch.cumsum(widths, dim=-1)\n    cumwidths = F.pad(cumwidths, pad=(1, 0), mode=\"constant\", value=0.0)\n    cumwidths = (right - left) * cumwidths + left\n    cumwidths[..., 0] = left\n    cumwidths[..., -1] = right\n    widths = cumwidths[..., 1:] - cumwidths[..., :-1]\n\n    derivatives = min_derivative + F.softplus(unnormalized_derivatives)\n\n    heights = F.softmax(unnormalized_heights, dim=-1)\n    heights = min_bin_height + (1 - min_bin_height * num_bins) * heights\n    cumheights = torch.cumsum(heights, dim=-1)\n    cumheights = F.pad(cumheights, pad=(1, 0), mode=\"constant\", value=0.0)\n    cumheights = (top - bottom) * cumheights + bottom\n    cumheights[..., 0] = bottom\n    cumheights[..., -1] = top\n    heights = cumheights[..., 1:] - cumheights[..., :-1]\n\n    if inverse:\n        bin_idx = searchsorted(cumheights, inputs)[..., None]\n    else:\n        bin_idx = searchsorted(cumwidths, inputs)[..., None]\n\n    input_cumwidths = cumwidths.gather(-1, bin_idx)[..., 0]\n    input_bin_widths = widths.gather(-1, bin_idx)[..., 0]\n\n    input_cumheights = cumheights.gather(-1, bin_idx)[..., 0]\n    delta = heights / widths\n    input_delta = delta.gather(-1, bin_idx)[..., 0]\n\n    input_derivatives = derivatives.gather(-1, bin_idx)[..., 0]\n    input_derivatives_plus_one = derivatives[..., 1:].gather(-1, bin_idx)[..., 0]\n\n    input_heights = heights.gather(-1, bin_idx)[..., 0]\n\n    if inverse:\n        a = (inputs - input_cumheights) * (\n            input_derivatives + input_derivatives_plus_one - 2 * input_delta\n        ) + input_heights * (input_delta - input_derivatives)\n        b = input_heights * input_derivatives - (inputs - input_cumheights) * (\n            input_derivatives + input_derivatives_plus_one - 2 * input_delta\n        )\n        c = -input_delta * (inputs - input_cumheights)\n\n        discriminant = b.pow(2) - 4 * a * c\n        assert (discriminant >= 0).all()\n\n       
 root = (2 * c) / (-b - torch.sqrt(discriminant))\n        outputs = root * input_bin_widths + input_cumwidths\n\n        theta_one_minus_theta = root * (1 - root)\n        denominator = input_delta + (\n            (input_derivatives + input_derivatives_plus_one - 2 * input_delta) * theta_one_minus_theta\n        )\n        derivative_numerator = input_delta.pow(2) * (\n            input_derivatives_plus_one * root.pow(2)\n            + 2 * input_delta * theta_one_minus_theta\n            + input_derivatives * (1 - root).pow(2)\n        )\n        logabsdet = torch.log(derivative_numerator) - 2 * torch.log(denominator)\n\n        return outputs, -logabsdet\n    else:\n        theta = (inputs - input_cumwidths) / input_bin_widths\n        theta_one_minus_theta = theta * (1 - theta)\n\n        numerator = input_heights * (input_delta * theta.pow(2) + input_derivatives * theta_one_minus_theta)\n        denominator = input_delta + (\n            (input_derivatives + input_derivatives_plus_one - 2 * input_delta) * theta_one_minus_theta\n        )\n        outputs = input_cumheights + numerator / denominator\n\n        derivative_numerator = input_delta.pow(2) * (\n            input_derivatives_plus_one * theta.pow(2)\n            + 2 * input_delta * theta_one_minus_theta\n            + input_derivatives * (1 - theta).pow(2)\n        )\n        logabsdet = torch.log(derivative_numerator) - 2 * torch.log(denominator)\n\n        return outputs, logabsdet\n"
  },
  {
    "path": "TTS/tts/models/__init__.py",
    "content": "from typing import Dict, List, Union\n\nfrom TTS.utils.generic_utils import find_module\n\n\ndef setup_model(config: \"Coqpit\", samples: Union[List[List], List[Dict]] = None) -> \"BaseTTS\":\n    print(\" > Using model: {}\".format(config.model))\n    # fetch the right model implementation.\n    if \"base_model\" in config and config[\"base_model\"] is not None:\n        MyModel = find_module(\"TTS.tts.models\", config.base_model.lower())\n    else:\n        MyModel = find_module(\"TTS.tts.models\", config.model.lower())\n    model = MyModel.init_from_config(config, samples)\n    return model\n"
  },
  {
    "path": "TTS/tts/models/align_tts.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import Dict, List, Union\n\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\n\nfrom TTS.tts.layers.align_tts.mdn import MDNBlock\nfrom TTS.tts.layers.feed_forward.decoder import Decoder\nfrom TTS.tts.layers.feed_forward.duration_predictor import DurationPredictor\nfrom TTS.tts.layers.feed_forward.encoder import Encoder\nfrom TTS.tts.layers.generic.pos_encoding import PositionalEncoding\nfrom TTS.tts.models.base_tts import BaseTTS\nfrom TTS.tts.utils.helpers import generate_path, maximum_path, sequence_mask\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\nfrom TTS.utils.io import load_fsspec\n\n\n@dataclass\nclass AlignTTSArgs(Coqpit):\n    \"\"\"\n    Args:\n        num_chars (int):\n            number of unique input to characters\n        out_channels (int):\n            number of output tensor channels. It is equal to the expected spectrogram size.\n        hidden_channels (int):\n            number of channels in all the model layers.\n        hidden_channels_ffn (int):\n            number of channels in transformer's conv layers.\n        hidden_channels_dp (int):\n            number of channels in duration predictor network.\n        num_heads (int):\n            number of attention heads in transformer networks.\n        num_transformer_layers (int):\n            number of layers in encoder and decoder transformer blocks.\n        dropout_p (int):\n            dropout rate in transformer layers.\n        length_scale (int, optional):\n            coefficient to set the speech speed. <1 slower, >1 faster. Defaults to 1.\n        num_speakers (int, optional):\n            number of speakers for multi-speaker training. Defaults to 0.\n        external_c (bool, optional):\n            enable external speaker embeddings. 
Defaults to False.\n        c_in_channels (int, optional):\n            number of channels in speaker embedding vectors. Defaults to 0.\n    \"\"\"\n\n    num_chars: int = None\n    out_channels: int = 80\n    hidden_channels: int = 256\n    hidden_channels_dp: int = 256\n    encoder_type: str = \"fftransformer\"\n    encoder_params: dict = field(\n        default_factory=lambda: {\"hidden_channels_ffn\": 1024, \"num_heads\": 2, \"num_layers\": 6, \"dropout_p\": 0.1}\n    )\n    decoder_type: str = \"fftransformer\"\n    decoder_params: dict = field(\n        default_factory=lambda: {\"hidden_channels_ffn\": 1024, \"num_heads\": 2, \"num_layers\": 6, \"dropout_p\": 0.1}\n    )\n    length_scale: float = 1.0\n    num_speakers: int = 0\n    use_speaker_embedding: bool = False\n    use_d_vector_file: bool = False\n    d_vector_dim: int = 0\n\n\nclass AlignTTS(BaseTTS):\n    \"\"\"AlignTTS with modified duration predictor.\n    https://arxiv.org/pdf/2003.01950.pdf\n\n    Encoder -> DurationPredictor -> Decoder\n\n    Check :class:`AlignTTSArgs` for the class arguments.\n\n    Paper Abstract:\n        Targeting at both high efficiency and performance, we propose AlignTTS to predict the\n        mel-spectrum in parallel. AlignTTS is based on a Feed-Forward Transformer which generates mel-spectrum from a\n        sequence of characters, and the duration of each character is determined by a duration predictor. Instead of\n        adopting the attention mechanism in Transformer TTS to align text to mel-spectrum, the alignment loss is presented\n        to consider all possible alignments in training by use of dynamic programming. 
Experiments on the LJSpeech dataset\n        show that our model achieves not only state-of-the-art performance which outperforms Transformer TTS by 0.03 in mean\n        opinion score (MOS), but also a high efficiency which is more than 50 times faster than real-time.\n\n    Note:\n        Original model uses a separate character embedding layer for duration predictor. However, it causes the\n        duration predictor to overfit and prevents learning higher level interactions among characters. Therefore,\n        we predict durations based on encoder outputs which have higher level information about input characters. This\n        enables training without phases as in the original paper.\n\n        Original model uses Transformers in encoder and decoder layers. However, here you can set the architecture\n        differently based on your requirements using ```encoder_type``` and ```decoder_type``` parameters.\n\n    Examples:\n        >>> from TTS.tts.configs.align_tts_config import AlignTTSConfig\n        >>> config = AlignTTSConfig()\n        >>> model = AlignTTS(config)\n\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n\n    def __init__(\n        self,\n        config: \"AlignTTSConfig\",\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n        self.speaker_manager = speaker_manager\n        self.phase = -1\n        self.length_scale = (\n            float(config.model_args.length_scale)\n            if isinstance(config.model_args.length_scale, int)\n            else config.model_args.length_scale\n        )\n\n        self.emb = nn.Embedding(self.config.model_args.num_chars, self.config.model_args.hidden_channels)\n\n        self.embedded_speaker_dim = 0\n        self.init_multispeaker(config)\n\n        self.pos_encoder = PositionalEncoding(config.model_args.hidden_channels)\n        
self.encoder = Encoder(\n            config.model_args.hidden_channels,\n            config.model_args.hidden_channels,\n            config.model_args.encoder_type,\n            config.model_args.encoder_params,\n            self.embedded_speaker_dim,\n        )\n        self.decoder = Decoder(\n            config.model_args.out_channels,\n            config.model_args.hidden_channels,\n            config.model_args.decoder_type,\n            config.model_args.decoder_params,\n        )\n        self.duration_predictor = DurationPredictor(config.model_args.hidden_channels_dp)\n\n        self.mod_layer = nn.Conv1d(config.model_args.hidden_channels, config.model_args.hidden_channels, 1)\n\n        self.mdn_block = MDNBlock(config.model_args.hidden_channels, 2 * config.model_args.out_channels)\n\n        if self.embedded_speaker_dim > 0 and self.embedded_speaker_dim != config.model_args.hidden_channels:\n            self.proj_g = nn.Conv1d(self.embedded_speaker_dim, config.model_args.hidden_channels, 1)\n\n    @staticmethod\n    def compute_log_probs(mu, log_sigma, y):\n        # pylint: disable=protected-access, c-extension-no-member\n        y = y.transpose(1, 2).unsqueeze(1)  # [B, 1, T1, D]\n        mu = mu.transpose(1, 2).unsqueeze(2)  # [B, T2, 1, D]\n        log_sigma = log_sigma.transpose(1, 2).unsqueeze(2)  # [B, T2, 1, D]\n        expanded_y, expanded_mu = torch.broadcast_tensors(y, mu)\n        exponential = -0.5 * torch.mean(\n            torch._C._nn.mse_loss(expanded_y, expanded_mu, 0) / torch.pow(log_sigma.exp(), 2), dim=-1\n        )  # B, L, T\n        logp = exponential - 0.5 * log_sigma.mean(dim=-1)\n        return logp\n\n    def compute_align_path(self, mu, log_sigma, y, x_mask, y_mask):\n        # find the max alignment path\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        log_p = self.compute_log_probs(mu, log_sigma, y)\n        # [B, T_en, T_dec]\n        attn = maximum_path(log_p, 
attn_mask.squeeze(1)).unsqueeze(1)\n        dr_mas = torch.sum(attn, -1)\n        return dr_mas.squeeze(1), log_p\n\n    @staticmethod\n    def generate_attn(dr, x_mask, y_mask=None):\n        # compute decode mask from the durations\n        if y_mask is None:\n            y_lengths = dr.sum(1).long()\n            y_lengths[y_lengths < 1] = 1\n            y_mask = torch.unsqueeze(sequence_mask(y_lengths, None), 1).to(dr.dtype)\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        attn = generate_path(dr, attn_mask.squeeze(1)).to(dr.dtype)\n        return attn\n\n    def expand_encoder_outputs(self, en, dr, x_mask, y_mask):\n        \"\"\"Generate attention alignment map from durations and\n        expand encoder outputs\n\n        Examples::\n            - encoder output: [a,b,c,d]\n            - durations: [1, 3, 2, 1]\n\n            - expanded: [a, b, b, b, c, c, d]\n            - attention map: [[0, 0, 0, 0, 0, 0, 1],\n                             [0, 0, 0, 0, 1, 1, 0],\n                             [0, 1, 1, 1, 0, 0, 0],\n                             [1, 0, 0, 0, 0, 0, 0]]\n        \"\"\"\n        attn = self.generate_attn(dr, x_mask, y_mask)\n        o_en_ex = torch.matmul(attn.squeeze(1).transpose(1, 2), en.transpose(1, 2)).transpose(1, 2)\n        return o_en_ex, attn\n\n    def format_durations(self, o_dr_log, x_mask):\n        o_dr = (torch.exp(o_dr_log) - 1) * x_mask * self.length_scale\n        o_dr[o_dr < 1] = 1.0\n        o_dr = torch.round(o_dr)\n        return o_dr\n\n    @staticmethod\n    def _concat_speaker_embedding(o_en, g):\n        g_exp = g.expand(-1, -1, o_en.size(-1))  # [B, C, T_en]\n        o_en = torch.cat([o_en, g_exp], 1)\n        return o_en\n\n    def _sum_speaker_embedding(self, x, g):\n        # project g to decoder dim.\n        if hasattr(self, \"proj_g\"):\n            g = self.proj_g(g)\n\n        return x + g\n\n    def _forward_encoder(self, x, x_lengths, g=None):\n        if hasattr(self, 
\"emb_g\"):\n            g = nn.functional.normalize(self.speaker_embedding(g))  # [B, C, 1]\n\n        if g is not None:\n            g = g.unsqueeze(-1)\n\n        # [B, T, C]\n        x_emb = self.emb(x)\n        # [B, C, T]\n        x_emb = torch.transpose(x_emb, 1, -1)\n\n        # compute sequence masks\n        x_mask = torch.unsqueeze(sequence_mask(x_lengths, x.shape[1]), 1).to(x.dtype)\n\n        # encoder pass\n        o_en = self.encoder(x_emb, x_mask)\n\n        # speaker conditioning for duration predictor\n        if g is not None:\n            o_en_dp = self._concat_speaker_embedding(o_en, g)\n        else:\n            o_en_dp = o_en\n        return o_en, o_en_dp, x_mask, g\n\n    def _forward_decoder(self, o_en, o_en_dp, dr, x_mask, y_lengths, g):\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, None), 1).to(o_en_dp.dtype)\n        # expand o_en with durations\n        o_en_ex, attn = self.expand_encoder_outputs(o_en, dr, x_mask, y_mask)\n        # positional encoding\n        if hasattr(self, \"pos_encoder\"):\n            o_en_ex = self.pos_encoder(o_en_ex, y_mask)\n        # speaker embedding\n        if g is not None:\n            o_en_ex = self._sum_speaker_embedding(o_en_ex, g)\n        # decoder pass\n        o_de = self.decoder(o_en_ex, y_mask, g=g)\n        return o_de, attn.transpose(1, 2)\n\n    def _forward_mdn(self, o_en, y, y_lengths, x_mask):\n        # MAS potentials and alignment\n        mu, log_sigma = self.mdn_block(o_en)\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, None), 1).to(o_en.dtype)\n        dr_mas, logp = self.compute_align_path(mu, log_sigma, y, x_mask, y_mask)\n        return dr_mas, mu, log_sigma, logp\n\n    def forward(\n        self, x, x_lengths, y, y_lengths, aux_input={\"d_vectors\": None}, phase=None\n    ):  # pylint: disable=unused-argument\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, T_max]`\n            - x_lengths: :math:`[B]`\n            - y_lengths: 
:math:`[B]`\n            - dr: :math:`[B, T_max]`\n            - g: :math:`[B, C]`\n        \"\"\"\n        y = y.transpose(1, 2)\n        g = aux_input[\"d_vectors\"] if \"d_vectors\" in aux_input else None\n        o_de, o_dr_log, dr_mas_log, attn, mu, log_sigma, logp = None, None, None, None, None, None, None\n        if phase == 0:\n            # train encoder and MDN\n            o_en, o_en_dp, x_mask, g = self._forward_encoder(x, x_lengths, g)\n            dr_mas, mu, log_sigma, logp = self._forward_mdn(o_en, y, y_lengths, x_mask)\n            y_mask = torch.unsqueeze(sequence_mask(y_lengths, None), 1).to(o_en_dp.dtype)\n            attn = self.generate_attn(dr_mas, x_mask, y_mask)\n        elif phase == 1:\n            # train decoder\n            o_en, o_en_dp, x_mask, g = self._forward_encoder(x, x_lengths, g)\n            dr_mas, _, _, _ = self._forward_mdn(o_en, y, y_lengths, x_mask)\n            o_de, attn = self._forward_decoder(o_en.detach(), o_en_dp.detach(), dr_mas.detach(), x_mask, y_lengths, g=g)\n        elif phase == 2:\n            # train the whole except duration predictor\n            o_en, o_en_dp, x_mask, g = self._forward_encoder(x, x_lengths, g)\n            dr_mas, mu, log_sigma, logp = self._forward_mdn(o_en, y, y_lengths, x_mask)\n            o_de, attn = self._forward_decoder(o_en, o_en_dp, dr_mas, x_mask, y_lengths, g=g)\n        elif phase == 3:\n            # train duration predictor\n            o_en, o_en_dp, x_mask, g = self._forward_encoder(x, x_lengths, g)\n            o_dr_log = self.duration_predictor(x, x_mask)\n            dr_mas, mu, log_sigma, logp = self._forward_mdn(o_en, y, y_lengths, x_mask)\n            o_de, attn = self._forward_decoder(o_en, o_en_dp, dr_mas, x_mask, y_lengths, g=g)\n            o_dr_log = o_dr_log.squeeze(1)\n        else:\n            o_en, o_en_dp, x_mask, g = self._forward_encoder(x, x_lengths, g)\n            o_dr_log = self.duration_predictor(o_en_dp.detach(), x_mask)\n            dr_mas, 
mu, log_sigma, logp = self._forward_mdn(o_en, y, y_lengths, x_mask)\n            o_de, attn = self._forward_decoder(o_en, o_en_dp, dr_mas, x_mask, y_lengths, g=g)\n            o_dr_log = o_dr_log.squeeze(1)\n        dr_mas_log = torch.log(dr_mas + 1).squeeze(1)\n        outputs = {\n            \"model_outputs\": o_de.transpose(1, 2),\n            \"alignments\": attn,\n            \"durations_log\": o_dr_log,\n            \"durations_mas_log\": dr_mas_log,\n            \"mu\": mu,\n            \"log_sigma\": log_sigma,\n            \"logp\": logp,\n        }\n        return outputs\n\n    @torch.no_grad()\n    def inference(self, x, aux_input={\"d_vectors\": None}):  # pylint: disable=unused-argument\n        \"\"\"\n        Shapes:\n            - x: :math:`[B, T_max]`\n            - x_lengths: :math:`[B]`\n            - g: :math:`[B, C]`\n        \"\"\"\n        g = aux_input[\"d_vectors\"] if \"d_vectors\" in aux_input else None\n        x_lengths = torch.tensor(x.shape[1:2]).to(x.device)\n        # pad input to prevent dropping the last word\n        # x = torch.nn.functional.pad(x, pad=(0, 5), mode='constant', value=0)\n        o_en, o_en_dp, x_mask, g = self._forward_encoder(x, x_lengths, g)\n        # o_dr_log = self.duration_predictor(x, x_mask)\n        o_dr_log = self.duration_predictor(o_en_dp, x_mask)\n        # duration predictor pass\n        o_dr = self.format_durations(o_dr_log, x_mask).squeeze(1)\n        y_lengths = o_dr.sum(1)\n        o_de, attn = self._forward_decoder(o_en, o_en_dp, o_dr, x_mask, y_lengths, g=g)\n        outputs = {\"model_outputs\": o_de.transpose(1, 2), \"alignments\": attn}\n        return outputs\n\n    def train_step(self, batch: dict, criterion: nn.Module):\n        text_input = batch[\"text_input\"]\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        mel_lengths = batch[\"mel_lengths\"]\n        d_vectors = batch[\"d_vectors\"]\n        speaker_ids = 
batch[\"speaker_ids\"]\n\n        aux_input = {\"d_vectors\": d_vectors, \"speaker_ids\": speaker_ids}\n        outputs = self.forward(text_input, text_lengths, mel_input, mel_lengths, aux_input, self.phase)\n        loss_dict = criterion(\n            outputs[\"logp\"],\n            outputs[\"model_outputs\"],\n            mel_input,\n            mel_lengths,\n            outputs[\"durations_log\"],\n            outputs[\"durations_mas_log\"],\n            text_lengths,\n            phase=self.phase,\n        )\n\n        return outputs, loss_dict\n\n    def _create_logs(self, batch, outputs, ap):  # pylint: disable=no-self-use\n        model_outputs = outputs[\"model_outputs\"]\n        alignments = outputs[\"alignments\"]\n        mel_input = batch[\"mel_input\"]\n\n        pred_spec = model_outputs[0].data.cpu().numpy()\n        gt_spec = mel_input[0].data.cpu().numpy()\n        align_img = alignments[0].data.cpu().numpy()\n\n        figures = {\n            \"prediction\": plot_spectrogram(pred_spec, ap, output_fig=False),\n            \"ground_truth\": plot_spectrogram(gt_spec, ap, output_fig=False),\n            \"alignment\": plot_alignment(align_img, output_fig=False),\n        }\n\n        # Sample audio\n        train_audio = ap.inv_melspectrogram(pred_spec.T)\n        return figures, {\"audio\": train_audio}\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ) -> None:  # pylint: disable=no-self-use\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    def eval_step(self, batch: dict, criterion: nn.Module):\n        return self.train_step(batch, criterion)\n\n    def eval_log(self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int) -> None:\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        
logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n\n    def get_criterion(self):\n        from TTS.tts.layers.losses import AlignTTSLoss  # pylint: disable=import-outside-toplevel\n\n        return AlignTTSLoss(self.config)\n\n    @staticmethod\n    def _set_phase(config, global_step):\n        \"\"\"Decide AlignTTS training phase\"\"\"\n        if isinstance(config.phase_start_steps, list):\n            vals = [i < global_step for i in config.phase_start_steps]\n            if not True in vals:\n                phase = 0\n            else:\n                phase = (\n                    len(config.phase_start_steps)\n                    - [i < global_step for i in config.phase_start_steps][::-1].index(True)\n                    - 1\n                )\n        else:\n            phase = None\n        return phase\n\n    def on_epoch_start(self, trainer):\n        \"\"\"Set AlignTTS training phase on epoch start.\"\"\"\n        self.phase = self._set_phase(trainer.config, trainer.total_steps_done)\n\n    @staticmethod\n    def init_from_config(config: \"AlignTTSConfig\", samples: Union[List[List], List[Dict]] = None):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (AlignTTSConfig): Model config.\n            samples (Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n        \"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config)\n        tokenizer, new_config = 
TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config, samples)\n        return AlignTTS(new_config, ap, tokenizer, speaker_manager)\n"
  },
  {
    "path": "TTS/tts/models/base_tacotron.py",
    "content": "import copy\nfrom abc import abstractmethod\nfrom typing import Dict, Tuple\n\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\n\nfrom TTS.tts.layers.losses import TacotronLoss\nfrom TTS.tts.models.base_tts import BaseTTS\nfrom TTS.tts.utils.helpers import sequence_mask\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.synthesis import synthesis\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\nfrom TTS.utils.generic_utils import format_aux_input\nfrom TTS.utils.io import load_fsspec\nfrom TTS.utils.training import gradual_training_scheduler\n\n\nclass BaseTacotron(BaseTTS):\n    \"\"\"Base class shared by Tacotron and Tacotron2\"\"\"\n\n    def __init__(\n        self,\n        config: \"TacotronConfig\",\n        ap: \"AudioProcessor\",\n        tokenizer: \"TTSTokenizer\",\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n\n        # pass all config fields as class attributes\n        for key in config:\n            setattr(self, key, config[key])\n\n        # layers\n        self.embedding = None\n        self.encoder = None\n        self.decoder = None\n        self.postnet = None\n\n        # init tensors\n        self.embedded_speakers = None\n        self.embedded_speakers_projected = None\n\n        # global style token\n        if self.gst and self.use_gst:\n            self.decoder_in_features += self.gst.gst_embedding_dim  # add gst embedding dim\n            self.gst_layer = None\n\n        # Capacitron\n        if self.capacitron_vae and self.use_capacitron_vae:\n            self.decoder_in_features += self.capacitron_vae.capacitron_VAE_embedding_dim  # add capacitron embedding dim\n            self.capacitron_vae_layer = None\n\n        # additional layers\n        self.decoder_backward = None\n        self.coarse_decoder = None\n\n    @staticmethod\n    
def _format_aux_input(aux_input: Dict) -> Dict:\n        \"\"\"Set missing fields to their default values\"\"\"\n        if aux_input:\n            return format_aux_input({\"d_vectors\": None, \"speaker_ids\": None}, aux_input)\n        return None\n\n    #############################\n    # INIT FUNCTIONS\n    #############################\n\n    def _init_backward_decoder(self):\n        \"\"\"Init the backward decoder for Forward-Backward decoding.\"\"\"\n        self.decoder_backward = copy.deepcopy(self.decoder)\n\n    def _init_coarse_decoder(self):\n        \"\"\"Init the coarse decoder for Double-Decoder Consistency.\"\"\"\n        self.coarse_decoder = copy.deepcopy(self.decoder)\n        self.coarse_decoder.r_init = self.ddc_r\n        self.coarse_decoder.set_r(self.ddc_r)\n\n    #############################\n    # CORE FUNCTIONS\n    #############################\n\n    @abstractmethod\n    def forward(self):\n        pass\n\n    @abstractmethod\n    def inference(self):\n        pass\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        \"\"\"Load model checkpoint and set up internals.\n\n        Args:\n            config (Coqpit): model configuration.\n            checkpoint_path (str): path to checkpoint file.\n            eval (bool, optional): whether to load model for evaluation.\n            cache (bool, optional): If True, cache the file locally for subsequent calls. It is cached under `get_user_data_dir()/tts_cache`. 
Defaults to False.\n        \"\"\"\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"])\n        # TODO: set r in run-time by taking it from the new config\n        if \"r\" in state:\n            # set r from the state (for compatibility with older checkpoints)\n            self.decoder.set_r(state[\"r\"])\n        elif \"config\" in state:\n            # set r from config used at training time (for inference)\n            self.decoder.set_r(state[\"config\"][\"r\"])\n        else:\n            # set r from the new config (for new-models)\n            self.decoder.set_r(config.r)\n        if eval:\n            self.eval()\n            print(f\" > Model's reduction rate `r` is set to: {self.decoder.r}\")\n            assert not self.training\n\n    def get_criterion(self) -> nn.Module:\n        \"\"\"Get the model criterion used in training.\"\"\"\n        return TacotronLoss(self.config)\n\n    @staticmethod\n    def init_from_config(config: Coqpit):\n        \"\"\"Initialize model from config.\"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config)\n        # TTSTokenizer.init_from_config returns the tokenizer together with the updated config\n        tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config)\n        return BaseTacotron(new_config, ap, tokenizer, speaker_manager)\n\n    ##########################\n    # TEST AND LOG FUNCTIONS #\n    ##########################\n\n    def test_run(self, assets: Dict) -> Tuple[Dict, Dict]:\n        \"\"\"Generic test run for `tts` models used by `Trainer`.\n\n        You can override this for a different behaviour.\n\n        Args:\n            assets (dict): A dict of training assets. 
For `tts` models, it must include `{'audio_processor': ap}`.\n\n        Returns:\n            Tuple[Dict, Dict]: Test figures and audios to be projected to Tensorboard.\n        \"\"\"\n        print(\" | > Synthesizing test sentences.\")\n        test_audios = {}\n        test_figures = {}\n        test_sentences = self.config.test_sentences\n        aux_inputs = self._get_test_aux_input()\n        for idx, sen in enumerate(test_sentences):\n            outputs_dict = synthesis(\n                self,\n                sen,\n                self.config,\n                \"cuda\" in str(next(self.parameters()).device),\n                speaker_id=aux_inputs[\"speaker_id\"],\n                d_vector=aux_inputs[\"d_vector\"],\n                style_wav=aux_inputs[\"style_wav\"],\n                use_griffin_lim=True,\n                do_trim_silence=False,\n            )\n            test_audios[\"{}-audio\".format(idx)] = outputs_dict[\"wav\"]\n            test_figures[\"{}-prediction\".format(idx)] = plot_spectrogram(\n                outputs_dict[\"outputs\"][\"model_outputs\"], self.ap, output_fig=False\n            )\n            test_figures[\"{}-alignment\".format(idx)] = plot_alignment(\n                outputs_dict[\"outputs\"][\"alignments\"], output_fig=False\n            )\n        return {\"figures\": test_figures, \"audios\": test_audios}\n\n    def test_log(\n        self, outputs: dict, logger: \"Logger\", assets: dict, steps: int  # pylint: disable=unused-argument\n    ) -> None:\n        logger.test_audios(steps, outputs[\"audios\"], self.ap.sample_rate)\n        logger.test_figures(steps, outputs[\"figures\"])\n\n    #############################\n    # COMMON COMPUTE FUNCTIONS\n    #############################\n\n    def compute_masks(self, text_lengths, mel_lengths):\n        \"\"\"Compute masks  against sequence paddings.\"\"\"\n        # B x T_in_max (boolean)\n        input_mask = sequence_mask(text_lengths)\n        output_mask = None\n      
  if mel_lengths is not None:\n            max_len = mel_lengths.max()\n            r = self.decoder.r\n            max_len = max_len + (r - (max_len % r)) if max_len % r > 0 else max_len\n            output_mask = sequence_mask(mel_lengths, max_len=max_len)\n        return input_mask, output_mask\n\n    def _backward_pass(self, mel_specs, encoder_outputs, mask):\n        \"\"\"Run backwards decoder\"\"\"\n        decoder_outputs_b, alignments_b, _ = self.decoder_backward(\n            encoder_outputs, torch.flip(mel_specs, dims=(1,)), mask\n        )\n        decoder_outputs_b = decoder_outputs_b.transpose(1, 2).contiguous()\n        return decoder_outputs_b, alignments_b\n\n    def _coarse_decoder_pass(self, mel_specs, encoder_outputs, alignments, input_mask):\n        \"\"\"Double Decoder Consistency\"\"\"\n        T = mel_specs.shape[1]\n        if T % self.coarse_decoder.r > 0:\n            padding_size = self.coarse_decoder.r - (T % self.coarse_decoder.r)\n            mel_specs = torch.nn.functional.pad(mel_specs, (0, 0, 0, padding_size, 0, 0))\n        decoder_outputs_backward, alignments_backward, _ = self.coarse_decoder(\n            encoder_outputs.detach(), mel_specs, input_mask\n        )\n        # scale_factor = self.decoder.r_init / self.decoder.r\n        alignments_backward = torch.nn.functional.interpolate(\n            alignments_backward.transpose(1, 2),\n            size=alignments.shape[1],\n            mode=\"nearest\",\n        ).transpose(1, 2)\n        decoder_outputs_backward = decoder_outputs_backward.transpose(1, 2)\n        decoder_outputs_backward = decoder_outputs_backward[:, :T, :]\n        return decoder_outputs_backward, alignments_backward\n\n    #############################\n    # EMBEDDING FUNCTIONS\n    #############################\n\n    def compute_gst(self, inputs, style_input, speaker_embedding=None):\n        \"\"\"Compute global style token\"\"\"\n        if isinstance(style_input, dict):\n            # multiply each 
style token with a weight\n            query = torch.zeros(1, 1, self.gst.gst_embedding_dim // 2).type_as(inputs)\n            if speaker_embedding is not None:\n                query = torch.cat([query, speaker_embedding.reshape(1, 1, -1)], dim=-1)\n\n            _GST = torch.tanh(self.gst_layer.style_token_layer.style_tokens)\n            gst_outputs = torch.zeros(1, 1, self.gst.gst_embedding_dim).type_as(inputs)\n            for k_token, v_amplifier in style_input.items():\n                key = _GST[int(k_token)].unsqueeze(0).expand(1, -1, -1)\n                gst_outputs_att = self.gst_layer.style_token_layer.attention(query, key)\n                gst_outputs = gst_outputs + gst_outputs_att * v_amplifier\n        elif style_input is None:\n            # ignore style token and return zero tensor\n            gst_outputs = torch.zeros(1, 1, self.gst.gst_embedding_dim).type_as(inputs)\n        else:\n            # compute style tokens\n            gst_outputs = self.gst_layer(style_input, speaker_embedding)  # pylint: disable=not-callable\n        inputs = self._concat_speaker_embedding(inputs, gst_outputs)\n        return inputs\n\n    def compute_capacitron_VAE_embedding(self, inputs, reference_mel_info, text_info=None, speaker_embedding=None):\n        \"\"\"Capacitron Variational Autoencoder\"\"\"\n        (\n            VAE_outputs,\n            posterior_distribution,\n            prior_distribution,\n            capacitron_beta,\n        ) = self.capacitron_vae_layer(\n            reference_mel_info,\n            text_info,\n            speaker_embedding,  # pylint: disable=not-callable\n        )\n\n        VAE_outputs = VAE_outputs.to(inputs.device)\n        encoder_output = self._concat_speaker_embedding(\n            inputs, VAE_outputs\n        )  # concatenate to the output of the basic tacotron encoder\n        return (\n            encoder_output,\n            posterior_distribution,\n            prior_distribution,\n            capacitron_beta,\n  
      )\n\n    @staticmethod\n    def _add_speaker_embedding(outputs, embedded_speakers):\n        embedded_speakers_ = embedded_speakers.expand(outputs.size(0), outputs.size(1), -1)\n        outputs = outputs + embedded_speakers_\n        return outputs\n\n    @staticmethod\n    def _concat_speaker_embedding(outputs, embedded_speakers):\n        embedded_speakers_ = embedded_speakers.expand(outputs.size(0), outputs.size(1), -1)\n        outputs = torch.cat([outputs, embedded_speakers_], dim=-1)\n        return outputs\n\n    #############################\n    # CALLBACKS\n    #############################\n\n    def on_epoch_start(self, trainer):\n        \"\"\"Callback for setting values wrt gradual training schedule.\n\n        Args:\n            trainer (TrainerTTS): TTS trainer object that is used to train this model.\n        \"\"\"\n        if self.gradual_training:\n            r, trainer.config.batch_size = gradual_training_scheduler(trainer.total_steps_done, trainer.config)\n            trainer.config.r = r\n            self.decoder.set_r(r)\n            if trainer.config.bidirectional_decoder:\n                trainer.model.decoder_backward.set_r(r)\n            print(f\"\\n > Number of output frames: {self.decoder.r}\")\n"
  },
  {
    "path": "TTS/tts/models/base_tts.py",
    "content": "import os\nimport random\nfrom typing import Dict, List, Tuple, Union\n\nimport torch\nimport torch.distributed as dist\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.utils.data import DataLoader\nfrom torch.utils.data.sampler import WeightedRandomSampler\nfrom trainer.torch import DistributedSampler, DistributedSamplerWrapper\n\nfrom TTS.model import BaseTrainerModel\nfrom TTS.tts.datasets.dataset import TTSDataset\nfrom TTS.tts.utils.data import get_length_balancer_weights\nfrom TTS.tts.utils.languages import LanguageManager, get_language_balancer_weights\nfrom TTS.tts.utils.speakers import SpeakerManager, get_speaker_balancer_weights, get_speaker_manager\nfrom TTS.tts.utils.synthesis import synthesis\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\n\n# pylint: skip-file\n\n\nclass BaseTTS(BaseTrainerModel):\n    \"\"\"Base `tts` class. Every new `tts` model must inherit this.\n\n    It defines common `tts` specific functions on top of `Model` implementation.\n    \"\"\"\n\n    MODEL_TYPE = \"tts\"\n\n    def __init__(\n        self,\n        config: Coqpit,\n        ap: \"AudioProcessor\",\n        tokenizer: \"TTSTokenizer\",\n        speaker_manager: SpeakerManager = None,\n        language_manager: LanguageManager = None,\n    ):\n        super().__init__()\n        self.config = config\n        self.ap = ap\n        self.tokenizer = tokenizer\n        self.speaker_manager = speaker_manager\n        self.language_manager = language_manager\n        self._set_model_args(config)\n\n    def _set_model_args(self, config: Coqpit):\n        \"\"\"Setup model args based on the config type (`ModelConfig` or `ModelArgs`).\n\n        `ModelArgs` has all the fields required to initialize the model architecture.\n\n        `ModelConfig` has all the fields required for training, inference and contains `ModelArgs`.\n\n        If the config is for training with a name like \"*Config\", then the model args are embedded in 
the\n        config.model_args\n\n        If the config is for the model with a name like \"*Args\", then we assign it directly.\n        \"\"\"\n        # don't use isinstance, to avoid recursive imports\n        if \"Config\" in config.__class__.__name__:\n            config_num_chars = (\n                self.config.model_args.num_chars if hasattr(self.config, \"model_args\") else self.config.num_chars\n            )\n            num_chars = config_num_chars if self.tokenizer is None else self.tokenizer.characters.num_chars\n            if \"characters\" in config:\n                self.config.num_chars = num_chars\n                if hasattr(self.config, \"model_args\"):\n                    config.model_args.num_chars = num_chars\n                    self.args = self.config.model_args\n            else:\n                self.config = config\n                self.args = config.model_args\n        elif \"Args\" in config.__class__.__name__:\n            self.args = config\n        else:\n            raise ValueError(\"config must be either a *Config or *Args\")\n\n    def init_multispeaker(self, config: Coqpit, data: List = None):\n        \"\"\"Initialize a speaker embedding layer if needed and define expected embedding channel size for defining\n        `in_channels` size of the connected layers.\n\n        This implementation yields 3 possible outcomes:\n\n        1. If `config.use_speaker_embedding` and `config.use_d_vector_file` are False, do nothing.\n        2. If `config.use_d_vector_file` is True, set expected embedding channel size to `config.d_vector_dim` or 512.\n        3. 
If `config.use_speaker_embedding`, initialize a speaker embedding layer with channel size of\n        `config.d_vector_dim` or 512.\n\n        You can override this function for new models.\n\n        Args:\n            config (Coqpit): Model configuration.\n        \"\"\"\n        # set number of speakers\n        if self.speaker_manager is not None:\n            self.num_speakers = self.speaker_manager.num_speakers\n        elif hasattr(config, \"num_speakers\"):\n            self.num_speakers = config.num_speakers\n\n        # set ultimate speaker embedding size\n        if config.use_speaker_embedding or config.use_d_vector_file:\n            self.embedded_speaker_dim = (\n                config.d_vector_dim if \"d_vector_dim\" in config and config.d_vector_dim is not None else 512\n            )\n        # init speaker embedding layer\n        if config.use_speaker_embedding and not config.use_d_vector_file:\n            print(\" > Init speaker_embedding layer.\")\n            self.speaker_embedding = nn.Embedding(self.num_speakers, self.embedded_speaker_dim)\n            self.speaker_embedding.weight.data.normal_(0, 0.3)\n\n    def get_aux_input(self, **kwargs) -> Dict:\n        \"\"\"Prepare and return `aux_input` used by `forward()`\"\"\"\n        return {\"speaker_id\": None, \"style_wav\": None, \"d_vector\": None, \"language_id\": None}\n\n    def get_aux_input_from_test_sentences(self, sentence_info):\n        if hasattr(self.config, \"model_args\"):\n            config = self.config.model_args\n        else:\n            config = self.config\n\n        # extract speaker and language info\n        text, speaker_name, style_wav, language_name = None, None, None, None\n\n        if isinstance(sentence_info, list):\n            if len(sentence_info) == 1:\n                text = sentence_info[0]\n            elif len(sentence_info) == 2:\n                text, speaker_name = sentence_info\n            elif len(sentence_info) == 3:\n                text, 
speaker_name, style_wav = sentence_info\n            elif len(sentence_info) == 4:\n                text, speaker_name, style_wav, language_name = sentence_info\n        else:\n            text = sentence_info\n\n        # get speaker  id/d_vector\n        speaker_id, d_vector, language_id = None, None, None\n        if self.speaker_manager is not None:\n            if config.use_d_vector_file:\n                if speaker_name is None:\n                    d_vector = self.speaker_manager.get_random_embedding()\n                else:\n                    d_vector = self.speaker_manager.get_d_vector_by_name(speaker_name)\n            elif config.use_speaker_embedding:\n                if speaker_name is None:\n                    speaker_id = self.speaker_manager.get_random_id()\n                else:\n                    speaker_id = self.speaker_manager.name_to_id[speaker_name]\n\n        # get language id\n        if self.language_manager is not None and config.use_language_embedding and language_name is not None:\n            language_id = self.language_manager.name_to_id[language_name]\n\n        return {\n            \"text\": text,\n            \"speaker_id\": speaker_id,\n            \"style_wav\": style_wav,\n            \"d_vector\": d_vector,\n            \"language_id\": language_id,\n        }\n\n    def format_batch(self, batch: Dict) -> Dict:\n        \"\"\"Generic batch formatting for `TTSDataset`.\n\n        You must override this if you use a custom dataset.\n\n        Args:\n            batch (Dict): [description]\n\n        Returns:\n            Dict: [description]\n        \"\"\"\n        # setup input batch\n        text_input = batch[\"token_id\"]\n        text_lengths = batch[\"token_id_lengths\"]\n        speaker_names = batch[\"speaker_names\"]\n        linear_input = batch[\"linear\"]\n        mel_input = batch[\"mel\"]\n        mel_lengths = batch[\"mel_lengths\"]\n        stop_targets = batch[\"stop_targets\"]\n        item_idx = 
batch[\"item_idxs\"]\n        d_vectors = batch[\"d_vectors\"]\n        speaker_ids = batch[\"speaker_ids\"]\n        attn_mask = batch[\"attns\"]\n        waveform = batch[\"waveform\"]\n        pitch = batch[\"pitch\"]\n        energy = batch[\"energy\"]\n        language_ids = batch[\"language_ids\"]\n        max_text_length = torch.max(text_lengths.float())\n        max_spec_length = torch.max(mel_lengths.float())\n\n        # compute durations from attention masks\n        durations = None\n        if attn_mask is not None:\n            durations = torch.zeros(attn_mask.shape[0], attn_mask.shape[2])\n            for idx, am in enumerate(attn_mask):\n                # compute raw durations\n                c_idxs = am[:, : text_lengths[idx], : mel_lengths[idx]].max(1)[1]\n                # c_idxs, counts = torch.unique_consecutive(c_idxs, return_counts=True)\n                c_idxs, counts = torch.unique(c_idxs, return_counts=True)\n                dur = torch.ones([text_lengths[idx]]).to(counts.dtype)\n                dur[c_idxs] = counts\n                # smooth the durations and set any 0 duration to 1\n                # by cutting off from the largest duration indices.\n                extra_frames = dur.sum() - mel_lengths[idx]\n                largest_idxs = torch.argsort(-dur)[:extra_frames]\n                dur[largest_idxs] -= 1\n                assert (\n                    dur.sum() == mel_lengths[idx]\n                ), f\" [!] 
total duration {dur.sum()} vs spectrogram length {mel_lengths[idx]}\"\n                durations[idx, : text_lengths[idx]] = dur\n\n        # set stop targets wrt reduction factor\n        stop_targets = stop_targets.view(text_input.shape[0], stop_targets.size(1) // self.config.r, -1)\n        stop_targets = (stop_targets.sum(2) > 0.0).unsqueeze(2).float().squeeze(2)\n        stop_target_lengths = torch.divide(mel_lengths, self.config.r).ceil_()\n\n        return {\n            \"text_input\": text_input,\n            \"text_lengths\": text_lengths,\n            \"speaker_names\": speaker_names,\n            \"mel_input\": mel_input,\n            \"mel_lengths\": mel_lengths,\n            \"linear_input\": linear_input,\n            \"stop_targets\": stop_targets,\n            \"stop_target_lengths\": stop_target_lengths,\n            \"attn_mask\": attn_mask,\n            \"durations\": durations,\n            \"speaker_ids\": speaker_ids,\n            \"d_vectors\": d_vectors,\n            \"max_text_length\": float(max_text_length),\n            \"max_spec_length\": float(max_spec_length),\n            \"item_idx\": item_idx,\n            \"waveform\": waveform,\n            \"pitch\": pitch,\n            \"energy\": energy,\n            \"language_ids\": language_ids,\n            \"audio_unique_names\": batch[\"audio_unique_names\"],\n        }\n\n    def get_sampler(self, config: Coqpit, dataset: TTSDataset, num_gpus=1):\n        weights = None\n        data_items = dataset.samples\n\n        if getattr(config, \"use_language_weighted_sampler\", False):\n            alpha = getattr(config, \"language_weighted_sampler_alpha\", 1.0)\n            print(\" > Using Language weighted sampler with alpha:\", alpha)\n            weights = get_language_balancer_weights(data_items) * alpha\n\n        if getattr(config, \"use_speaker_weighted_sampler\", False):\n            alpha = getattr(config, \"speaker_weighted_sampler_alpha\", 1.0)\n            print(\" > Using 
Speaker weighted sampler with alpha:\", alpha)\n            if weights is not None:\n                weights += get_speaker_balancer_weights(data_items) * alpha\n            else:\n                weights = get_speaker_balancer_weights(data_items) * alpha\n\n        if getattr(config, \"use_length_weighted_sampler\", False):\n            alpha = getattr(config, \"length_weighted_sampler_alpha\", 1.0)\n            print(\" > Using Length weighted sampler with alpha:\", alpha)\n            if weights is not None:\n                weights += get_length_balancer_weights(data_items) * alpha\n            else:\n                weights = get_length_balancer_weights(data_items) * alpha\n\n        if weights is not None:\n            sampler = WeightedRandomSampler(weights, len(weights))\n        else:\n            sampler = None\n\n        # sampler for DDP\n        if sampler is None:\n            sampler = DistributedSampler(dataset) if num_gpus > 1 else None\n        else:  # If a sampler is already defined use this sampler and DDP sampler together\n            sampler = DistributedSamplerWrapper(sampler) if num_gpus > 1 else sampler\n\n        return sampler\n\n    def get_data_loader(\n        self,\n        config: Coqpit,\n        assets: Dict,\n        is_eval: bool,\n        samples: Union[List[Dict], List[List]],\n        verbose: bool,\n        num_gpus: int,\n        rank: int = None,\n    ) -> \"DataLoader\":\n        if is_eval and not config.run_eval:\n            loader = None\n        else:\n            # setup multi-speaker attributes\n            if self.speaker_manager is not None:\n                if hasattr(config, \"model_args\"):\n                    speaker_id_mapping = (\n                        self.speaker_manager.name_to_id if config.model_args.use_speaker_embedding else None\n                    )\n                    d_vector_mapping = self.speaker_manager.embeddings if config.model_args.use_d_vector_file else None\n                    
config.use_d_vector_file = config.model_args.use_d_vector_file\n                else:\n                    speaker_id_mapping = self.speaker_manager.name_to_id if config.use_speaker_embedding else None\n                    d_vector_mapping = self.speaker_manager.embeddings if config.use_d_vector_file else None\n            else:\n                speaker_id_mapping = None\n                d_vector_mapping = None\n\n            # setup multi-lingual attributes\n            if self.language_manager is not None:\n                language_id_mapping = self.language_manager.name_to_id if self.args.use_language_embedding else None\n            else:\n                language_id_mapping = None\n\n            # init dataloader\n            dataset = TTSDataset(\n                outputs_per_step=config.r if \"r\" in config else 1,\n                compute_linear_spec=config.model.lower() == \"tacotron\" or config.compute_linear_spec,\n                compute_f0=config.get(\"compute_f0\", False),\n                f0_cache_path=config.get(\"f0_cache_path\", None),\n                compute_energy=config.get(\"compute_energy\", False),\n                energy_cache_path=config.get(\"energy_cache_path\", None),\n                samples=samples,\n                ap=self.ap,\n                return_wav=config.return_wav if \"return_wav\" in config else False,\n                batch_group_size=0 if is_eval else config.batch_group_size * config.batch_size,\n                min_text_len=config.min_text_len,\n                max_text_len=config.max_text_len,\n                min_audio_len=config.min_audio_len,\n                max_audio_len=config.max_audio_len,\n                phoneme_cache_path=config.phoneme_cache_path,\n                precompute_num_workers=config.precompute_num_workers,\n                use_noise_augment=False if is_eval else config.use_noise_augment,\n                verbose=verbose,\n                speaker_id_mapping=speaker_id_mapping,\n                
d_vector_mapping=d_vector_mapping if config.use_d_vector_file else None,\n                tokenizer=self.tokenizer,\n                start_by_longest=config.start_by_longest,\n                language_id_mapping=language_id_mapping,\n            )\n\n            # wait all the DDP process to be ready\n            if num_gpus > 1:\n                dist.barrier()\n\n            # sort input sequences from short to long\n            dataset.preprocess_samples()\n\n            # get samplers\n            sampler = self.get_sampler(config, dataset, num_gpus)\n\n            loader = DataLoader(\n                dataset,\n                batch_size=config.eval_batch_size if is_eval else config.batch_size,\n                shuffle=config.shuffle if sampler is None else False,  # if there is no other sampler\n                collate_fn=dataset.collate_fn,\n                drop_last=config.drop_last,  # setting this False might cause issues in AMP training.\n                sampler=sampler,\n                num_workers=config.num_eval_loader_workers if is_eval else config.num_loader_workers,\n                pin_memory=False,\n            )\n        return loader\n\n    def _get_test_aux_input(\n        self,\n    ) -> Dict:\n        d_vector = None\n        if self.config.use_d_vector_file:\n            d_vector = [self.speaker_manager.embeddings[name][\"embedding\"] for name in self.speaker_manager.embeddings]\n            d_vector = (random.sample(sorted(d_vector), 1),)\n\n        aux_inputs = {\n            \"speaker_id\": None\n            if not self.config.use_speaker_embedding\n            else random.sample(sorted(self.speaker_manager.name_to_id.values()), 1),\n            \"d_vector\": d_vector,\n            \"style_wav\": None,  # TODO: handle GST style input\n        }\n        return aux_inputs\n\n    def test_run(self, assets: Dict) -> Tuple[Dict, Dict]:\n        \"\"\"Generic test run for `tts` models used by `Trainer`.\n\n        You can override this for a 
different behaviour.\n\n        Args:\n            assets (dict): A dict of training assets. For `tts` models, it must include `{'audio_processor': ap}`.\n\n        Returns:\n            Tuple[Dict, Dict]: Test figures and audios to be projected to Tensorboard.\n        \"\"\"\n        print(\" | > Synthesizing test sentences.\")\n        test_audios = {}\n        test_figures = {}\n        test_sentences = self.config.test_sentences\n        aux_inputs = self._get_test_aux_input()\n        for idx, sen in enumerate(test_sentences):\n            if isinstance(sen, list):\n                aux_inputs = self.get_aux_input_from_test_sentences(sen)\n                sen = aux_inputs[\"text\"]\n            outputs_dict = synthesis(\n                self,\n                sen,\n                self.config,\n                \"cuda\" in str(next(self.parameters()).device),\n                speaker_id=aux_inputs[\"speaker_id\"],\n                d_vector=aux_inputs[\"d_vector\"],\n                style_wav=aux_inputs[\"style_wav\"],\n                use_griffin_lim=True,\n                do_trim_silence=False,\n            )\n            test_audios[\"{}-audio\".format(idx)] = outputs_dict[\"wav\"]\n            test_figures[\"{}-prediction\".format(idx)] = plot_spectrogram(\n                outputs_dict[\"outputs\"][\"model_outputs\"], self.ap, output_fig=False\n            )\n            test_figures[\"{}-alignment\".format(idx)] = plot_alignment(\n                outputs_dict[\"outputs\"][\"alignments\"], output_fig=False\n            )\n        return test_figures, test_audios\n\n    def on_init_start(self, trainer):\n        \"\"\"Save the speaker.pth and language_ids.json at the beginning of the training. 
Also update both paths.\"\"\"\n        if self.speaker_manager is not None:\n            output_path = os.path.join(trainer.output_path, \"speakers.pth\")\n            self.speaker_manager.save_ids_to_file(output_path)\n            trainer.config.speakers_file = output_path\n            # some models don't have `model_args` set\n            if hasattr(trainer.config, \"model_args\"):\n                trainer.config.model_args.speakers_file = output_path\n            trainer.config.save_json(os.path.join(trainer.output_path, \"config.json\"))\n            print(f\" > `speakers.pth` is saved to {output_path}.\")\n            print(\" > `speakers_file` is updated in the config.json.\")\n\n        if self.language_manager is not None:\n            output_path = os.path.join(trainer.output_path, \"language_ids.json\")\n            self.language_manager.save_ids_to_file(output_path)\n            trainer.config.language_ids_file = output_path\n            if hasattr(trainer.config, \"model_args\"):\n                trainer.config.model_args.language_ids_file = output_path\n            trainer.config.save_json(os.path.join(trainer.output_path, \"config.json\"))\n            print(f\" > `language_ids.json` is saved to {output_path}.\")\n            print(\" > `language_ids_file` is updated in the config.json.\")\n"
  },
  {
    "path": "TTS/tts/models/forward_tts.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import Dict, List, Tuple, Union\n\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.cuda.amp.autocast_mode import autocast\n\nfrom TTS.tts.layers.feed_forward.decoder import Decoder\nfrom TTS.tts.layers.feed_forward.encoder import Encoder\nfrom TTS.tts.layers.generic.aligner import AlignmentNetwork\nfrom TTS.tts.layers.generic.pos_encoding import PositionalEncoding\nfrom TTS.tts.layers.glow_tts.duration_predictor import DurationPredictor\nfrom TTS.tts.models.base_tts import BaseTTS\nfrom TTS.tts.utils.helpers import average_over_durations, generate_path, maximum_path, sequence_mask\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_avg_energy, plot_avg_pitch, plot_spectrogram\nfrom TTS.utils.io import load_fsspec\n\n\n@dataclass\nclass ForwardTTSArgs(Coqpit):\n    \"\"\"ForwardTTS Model arguments.\n\n    Args:\n\n        num_chars (int):\n            Number of characters in the vocabulary. Defaults to 100.\n\n        out_channels (int):\n            Number of output channels. Defaults to 80.\n\n        hidden_channels (int):\n            Number of base hidden channels of the model. Defaults to 512.\n\n        use_aligner (bool):\n            Whether to use aligner network to learn the text to speech alignment or use pre-computed durations.\n            If set False, durations should be computed by `TTS/bin/compute_attention_masks.py` and path to the\n            pre-computed durations must be provided to `config.datasets[0].meta_file_attn_mask`. Defaults to True.\n\n        use_pitch (bool):\n            Use pitch predictor to learn the pitch. Defaults to True.\n\n        use_energy (bool):\n            Use energy predictor to learn the energy. 
Defaults to False.\n\n        duration_predictor_hidden_channels (int):\n            Number of hidden channels in the duration predictor. Defaults to 256.\n\n        duration_predictor_dropout_p (float):\n            Dropout rate for the duration predictor. Defaults to 0.1.\n\n        duration_predictor_kernel_size (int):\n            Kernel size of conv layers in the duration predictor. Defaults to 3.\n\n        pitch_predictor_hidden_channels (int):\n            Number of hidden channels in the pitch predictor. Defaults to 256.\n\n        pitch_predictor_dropout_p (float):\n            Dropout rate for the pitch predictor. Defaults to 0.1.\n\n        pitch_predictor_kernel_size (int):\n            Kernel size of conv layers in the pitch predictor. Defaults to 3.\n\n        pitch_embedding_kernel_size (int):\n            Kernel size of the projection layer in the pitch predictor. Defaults to 3.\n\n        energy_predictor_hidden_channels (int):\n            Number of hidden channels in the energy predictor. Defaults to 256.\n\n        energy_predictor_dropout_p (float):\n            Dropout rate for the energy predictor. Defaults to 0.1.\n\n        energy_predictor_kernel_size (int):\n            Kernel size of conv layers in the energy predictor. Defaults to 3.\n\n        energy_embedding_kernel_size (int):\n            Kernel size of the projection layer in the energy predictor. Defaults to 3.\n\n        positional_encoding (bool):\n            Whether to use positional encoding. Defaults to True.\n\n        positional_encoding_use_scale (bool):\n            Whether to use a learnable scale coefficient in the positional encoding. Defaults to True.\n\n        length_scale (int):\n            Length scale that multiplies the predicted durations. Larger values result in slower speech. Defaults to 1.\n\n        encoder_type (str):\n            Type of the encoder module. 
One of the encoders available in :class:`TTS.tts.layers.feed_forward.encoder`.\n            Defaults to `fftransformer` as in the paper.\n\n        encoder_params (dict):\n            Parameters of the encoder module. Defaults to ```{\"hidden_channels_ffn\": 1024, \"num_heads\": 1, \"num_layers\": 6, \"dropout_p\": 0.1}```\n\n        decoder_type (str):\n            Type of the decoder module. One of the decoders available in :class:`TTS.tts.layers.feed_forward.decoder`.\n            Defaults to `fftransformer` as in the paper.\n\n        decoder_params (dict):\n            Parameters of the decoder module. Defaults to ```{\"hidden_channels_ffn\": 1024, \"num_heads\": 1, \"num_layers\": 6, \"dropout_p\": 0.1}```\n\n        detach_duration_predictor (bool):\n            Detach the input to the duration predictor from the earlier computation graph so that the duration loss\n            does not propagate to the earlier layers. Defaults to False.\n\n        max_duration (int):\n            Maximum duration accepted by the model. Defaults to 75.\n\n        num_speakers (int):\n            Number of speakers for the speaker embedding layer. Defaults to 1.\n\n        speakers_file (str):\n            Path to the speaker mapping file for the Speaker Manager. Defaults to None.\n\n        speaker_embedding_channels (int):\n            Number of speaker embedding channels. Defaults to 256.\n\n        use_d_vector_file (bool):\n            Enable/Disable the use of d-vectors for multi-speaker training. Defaults to False.\n\n        d_vector_dim (int):\n            Number of d-vector channels. 
Defaults to None.\n\n    \"\"\"\n\n    num_chars: int = None\n    out_channels: int = 80\n    hidden_channels: int = 384\n    use_aligner: bool = True\n    # pitch params\n    use_pitch: bool = True\n    pitch_predictor_hidden_channels: int = 256\n    pitch_predictor_kernel_size: int = 3\n    pitch_predictor_dropout_p: float = 0.1\n    pitch_embedding_kernel_size: int = 3\n\n    # energy params\n    use_energy: bool = False\n    energy_predictor_hidden_channels: int = 256\n    energy_predictor_kernel_size: int = 3\n    energy_predictor_dropout_p: float = 0.1\n    energy_embedding_kernel_size: int = 3\n\n    # duration params\n    duration_predictor_hidden_channels: int = 256\n    duration_predictor_kernel_size: int = 3\n    duration_predictor_dropout_p: float = 0.1\n\n    positional_encoding: bool = True\n    poisitonal_encoding_use_scale: bool = True\n    length_scale: int = 1\n    encoder_type: str = \"fftransformer\"\n    encoder_params: dict = field(\n        default_factory=lambda: {\"hidden_channels_ffn\": 1024, \"num_heads\": 1, \"num_layers\": 6, \"dropout_p\": 0.1}\n    )\n    decoder_type: str = \"fftransformer\"\n    decoder_params: dict = field(\n        default_factory=lambda: {\"hidden_channels_ffn\": 1024, \"num_heads\": 1, \"num_layers\": 6, \"dropout_p\": 0.1}\n    )\n    detach_duration_predictor: bool = False\n    max_duration: int = 75\n    num_speakers: int = 1\n    use_speaker_embedding: bool = False\n    speakers_file: str = None\n    use_d_vector_file: bool = False\n    d_vector_dim: int = None\n    d_vector_file: str = None\n\n\nclass ForwardTTS(BaseTTS):\n    \"\"\"General forward TTS model implementation that uses an encoder-decoder architecture with an optional alignment\n    network and a pitch predictor.\n\n    If the alignment network is used, the model learns the text-to-speech alignment\n    from the data instead of using pre-computed durations.\n\n    If the pitch predictor is used, the model trains a pitch predictor that predicts 
average pitch value for each\n    input character as in the FastPitch model.\n\n    `ForwardTTS` can be configured to one of these architectures,\n\n        - FastPitch\n        - SpeedySpeech\n        - FastSpeech\n        - FastSpeech2 (requires average speech energy predictor)\n\n    Args:\n        config (Coqpit): Model coqpit class.\n        speaker_manager (SpeakerManager): Speaker manager for multi-speaker training. Only used for multi-speaker models.\n            Defaults to None.\n\n    Examples:\n        >>> from TTS.tts.models.forward_tts import ForwardTTS, ForwardTTSArgs\n        >>> config = ForwardTTSArgs()\n        >>> model = ForwardTTS(config)\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        config: Coqpit,\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n        self._set_model_args(config)\n\n        self.init_multispeaker(config)\n\n        self.max_duration = self.args.max_duration\n        self.use_aligner = self.args.use_aligner\n        self.use_pitch = self.args.use_pitch\n        self.use_energy = self.args.use_energy\n        self.binary_loss_weight = 0.0\n\n        self.length_scale = (\n            float(self.args.length_scale) if isinstance(self.args.length_scale, int) else self.args.length_scale\n        )\n\n        self.emb = nn.Embedding(self.args.num_chars, self.args.hidden_channels)\n\n        self.encoder = Encoder(\n            self.args.hidden_channels,\n            self.args.hidden_channels,\n            self.args.encoder_type,\n            self.args.encoder_params,\n            self.embedded_speaker_dim,\n        )\n\n        if self.args.positional_encoding:\n            self.pos_encoder = PositionalEncoding(self.args.hidden_channels)\n\n        self.decoder = Decoder(\n            self.args.out_channels,\n        
    self.args.hidden_channels,\n            self.args.decoder_type,\n            self.args.decoder_params,\n        )\n\n        self.duration_predictor = DurationPredictor(\n            self.args.hidden_channels + self.embedded_speaker_dim,\n            self.args.duration_predictor_hidden_channels,\n            self.args.duration_predictor_kernel_size,\n            self.args.duration_predictor_dropout_p,\n        )\n\n        if self.args.use_pitch:\n            self.pitch_predictor = DurationPredictor(\n                self.args.hidden_channels + self.embedded_speaker_dim,\n                self.args.pitch_predictor_hidden_channels,\n                self.args.pitch_predictor_kernel_size,\n                self.args.pitch_predictor_dropout_p,\n            )\n            self.pitch_emb = nn.Conv1d(\n                1,\n                self.args.hidden_channels,\n                kernel_size=self.args.pitch_embedding_kernel_size,\n                padding=int((self.args.pitch_embedding_kernel_size - 1) / 2),\n            )\n\n        if self.args.use_energy:\n            self.energy_predictor = DurationPredictor(\n                self.args.hidden_channels + self.embedded_speaker_dim,\n                self.args.energy_predictor_hidden_channels,\n                self.args.energy_predictor_kernel_size,\n                self.args.energy_predictor_dropout_p,\n            )\n            self.energy_emb = nn.Conv1d(\n                1,\n                self.args.hidden_channels,\n                kernel_size=self.args.energy_embedding_kernel_size,\n                padding=int((self.args.energy_embedding_kernel_size - 1) / 2),\n            )\n\n        if self.args.use_aligner:\n            self.aligner = AlignmentNetwork(\n                in_query_channels=self.args.out_channels, in_key_channels=self.args.hidden_channels\n            )\n\n    def init_multispeaker(self, config: Coqpit):\n        \"\"\"Init for multi-speaker training.\n\n        Args:\n            config 
(Coqpit): Model configuration.\n        \"\"\"\n        self.embedded_speaker_dim = 0\n        # init speaker manager\n        if self.speaker_manager is None and (config.use_d_vector_file or config.use_speaker_embedding):\n            raise ValueError(\n                \" > SpeakerManager is not provided. You must provide the SpeakerManager before initializing a multi-speaker model.\"\n            )\n        # set number of speakers\n        if self.speaker_manager is not None:\n            self.num_speakers = self.speaker_manager.num_speakers\n        # init d-vector embedding\n        if config.use_d_vector_file:\n            self.embedded_speaker_dim = config.d_vector_dim\n            if self.args.d_vector_dim != self.args.hidden_channels:\n                self.proj_g = nn.Conv1d(self.args.d_vector_dim, self.args.hidden_channels, 1)\n        # init speaker embedding layer\n        if config.use_speaker_embedding and not config.use_d_vector_file:\n            print(\" > Init speaker_embedding layer.\")\n            self.emb_g = nn.Embedding(self.num_speakers, self.args.hidden_channels)\n            nn.init.uniform_(self.emb_g.weight, -0.1, 0.1)\n\n    @staticmethod\n    def generate_attn(dr, x_mask, y_mask=None):\n        \"\"\"Generate an attention mask from the durations.\n\n        Shapes\n           - dr: :math:`(B, T_{en})`\n           - x_mask: :math:`(B, T_{en})`\n           - y_mask: :math:`(B, T_{de})`\n        \"\"\"\n        # compute decode mask from the durations\n        if y_mask is None:\n            y_lengths = dr.sum(1).long()\n            y_lengths[y_lengths < 1] = 1\n            y_mask = torch.unsqueeze(sequence_mask(y_lengths, None), 1).to(dr.dtype)\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        attn = generate_path(dr, attn_mask.squeeze(1)).to(dr.dtype)\n        return attn\n\n    def expand_encoder_outputs(self, en, dr, x_mask, y_mask):\n        \"\"\"Generate attention alignment map from durations 
and\n        expand encoder outputs\n\n        Shapes:\n            - en: :math:`(B, D_{en}, T_{en})`\n            - dr: :math:`(B, T_{en})`\n            - x_mask: :math:`(B, T_{en})`\n            - y_mask: :math:`(B, T_{de})`\n\n        Examples::\n\n            encoder output: [a,b,c,d]\n            durations: [1, 3, 2, 1]\n\n            expanded: [a, b, b, b, c, c, d]\n            attention map: [[0, 0, 0, 0, 0, 0, 1],\n                            [0, 0, 0, 0, 1, 1, 0],\n                            [0, 1, 1, 1, 0, 0, 0],\n                            [1, 0, 0, 0, 0, 0, 0]]\n        \"\"\"\n        attn = self.generate_attn(dr, x_mask, y_mask)\n        o_en_ex = torch.matmul(attn.squeeze(1).transpose(1, 2).to(en.dtype), en.transpose(1, 2)).transpose(1, 2)\n        return o_en_ex, attn\n\n    def format_durations(self, o_dr_log, x_mask):\n        \"\"\"Format predicted durations.\n        1. Convert to linear scale from log scale\n        2. Apply the length scale for speed adjustment\n        3. Apply masking.\n        4. Cast 0 durations to 1.\n        5. Round the duration values.\n\n        Args:\n            o_dr_log: Log scale durations.\n            x_mask: Input text mask.\n\n        Shapes:\n            - o_dr_log: :math:`(B, T_{de})`\n            - x_mask: :math:`(B, T_{en})`\n        \"\"\"\n        o_dr = (torch.exp(o_dr_log) - 1) * x_mask * self.length_scale\n        o_dr[o_dr < 1] = 1.0\n        o_dr = torch.round(o_dr)\n        return o_dr\n\n    def _forward_encoder(\n        self, x: torch.LongTensor, x_mask: torch.FloatTensor, g: torch.FloatTensor = None\n    ) -> Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor, torch.FloatTensor, torch.FloatTensor]:\n        \"\"\"Encoding forward pass.\n\n        1. Embed speaker IDs if multi-speaker mode.\n        2. Embed character sequences.\n        3. Run the encoder network.\n        4. 
Sum encoder outputs and speaker embeddings\n\n        Args:\n            x (torch.LongTensor): Input sequence IDs.\n            x_mask (torch.FloatTensor): Input sequence mask.\n            g (torch.FloatTensor, optional): Conditioning vectors. In general speaker embeddings. Defaults to None.\n\n        Returns:\n            Tuple[torch.tensor, torch.tensor, torch.tensor, torch.tensor]:\n                encoder output, input sequence mask, speaker embeddings, character embeddings\n\n        Shapes:\n            - x: :math:`(B, T_{en})`\n            - x_mask: :math:`(B, 1, T_{en})`\n            - g: :math:`(B, C)`\n        \"\"\"\n        if hasattr(self, \"emb_g\"):\n            g = self.emb_g(g)  # [B, C]\n        if g is not None:\n            g = g.unsqueeze(-1)  # [B, C, 1]\n        # [B, T, C]\n        x_emb = self.emb(x)\n        # encoder pass\n        o_en = self.encoder(torch.transpose(x_emb, 1, -1), x_mask)\n        # speaker conditioning\n        # TODO: try different ways of conditioning\n        if g is not None:\n            o_en = o_en + g\n        return o_en, x_mask, g, x_emb\n\n    def _forward_decoder(\n        self,\n        o_en: torch.FloatTensor,\n        dr: torch.IntTensor,\n        x_mask: torch.FloatTensor,\n        y_lengths: torch.IntTensor,\n        g: torch.FloatTensor,\n    ) -> Tuple[torch.FloatTensor, torch.FloatTensor]:\n        \"\"\"Decoding forward pass.\n\n        1. Compute the decoder output mask\n        2. Expand encoder output with the durations.\n        3. Apply position encoding.\n        4. Add speaker embeddings if multi-speaker mode.\n        5. 
Run the decoder.\n\n        Args:\n            o_en (torch.FloatTensor): Encoder output.\n            dr (torch.IntTensor): Ground truth durations or alignment network durations.\n            x_mask (torch.IntTensor): Input sequence mask.\n            y_lengths (torch.IntTensor): Output sequence lengths.\n            g (torch.FloatTensor): Conditioning vectors. In general speaker embeddings.\n\n        Returns:\n            Tuple[torch.FloatTensor, torch.FloatTensor]: Decoder output, attention map from durations.\n        \"\"\"\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, None), 1).to(o_en.dtype)\n        # expand o_en with durations\n        o_en_ex, attn = self.expand_encoder_outputs(o_en, dr, x_mask, y_mask)\n        # positional encoding\n        if hasattr(self, \"pos_encoder\"):\n            o_en_ex = self.pos_encoder(o_en_ex, y_mask)\n        # decoder pass\n        o_de = self.decoder(o_en_ex, y_mask, g=g)\n        return o_de.transpose(1, 2), attn.transpose(1, 2)\n\n    def _forward_pitch_predictor(\n        self,\n        o_en: torch.FloatTensor,\n        x_mask: torch.IntTensor,\n        pitch: torch.FloatTensor = None,\n        dr: torch.IntTensor = None,\n    ) -> Tuple[torch.FloatTensor, torch.FloatTensor]:\n        \"\"\"Pitch predictor forward pass.\n\n        1. Predict pitch from encoder outputs.\n        2. In training - Compute average pitch values for each input character from the ground truth pitch values.\n        3. Embed average pitch values.\n\n        Args:\n            o_en (torch.FloatTensor): Encoder output.\n            x_mask (torch.IntTensor): Input sequence mask.\n            pitch (torch.FloatTensor, optional): Ground truth pitch values. Defaults to None.\n            dr (torch.IntTensor, optional): Ground truth durations. 
Defaults to None.\n\n        Returns:\n            Tuple[torch.FloatTensor, torch.FloatTensor]: Pitch embedding, pitch prediction. In training, the average\n                ground truth pitch is also returned as a third element.\n\n        Shapes:\n            - o_en: :math:`(B, C, T_{en})`\n            - x_mask: :math:`(B, 1, T_{en})`\n            - pitch: :math:`(B, 1, T_{de})`\n            - dr: :math:`(B, T_{en})`\n        \"\"\"\n        o_pitch = self.pitch_predictor(o_en, x_mask)\n        if pitch is not None:\n            avg_pitch = average_over_durations(pitch, dr)\n            o_pitch_emb = self.pitch_emb(avg_pitch)\n            return o_pitch_emb, o_pitch, avg_pitch\n        o_pitch_emb = self.pitch_emb(o_pitch)\n        return o_pitch_emb, o_pitch\n\n    def _forward_energy_predictor(\n        self,\n        o_en: torch.FloatTensor,\n        x_mask: torch.IntTensor,\n        energy: torch.FloatTensor = None,\n        dr: torch.IntTensor = None,\n    ) -> Tuple[torch.FloatTensor, torch.FloatTensor]:\n        \"\"\"Energy predictor forward pass.\n\n        1. Predict energy from encoder outputs.\n        2. In training - Compute average energy values for each input character from the ground truth energy values.\n        3. Embed average energy values.\n\n        Args:\n            o_en (torch.FloatTensor): Encoder output.\n            x_mask (torch.IntTensor): Input sequence mask.\n            energy (torch.FloatTensor, optional): Ground truth energy values. Defaults to None.\n            dr (torch.IntTensor, optional): Ground truth durations. 
Defaults to None.\n\n        Returns:\n            Tuple[torch.FloatTensor, torch.FloatTensor]: Energy embedding, energy prediction. In training, the average\n                ground truth energy is also returned as a third element.\n\n        Shapes:\n            - o_en: :math:`(B, C, T_{en})`\n            - x_mask: :math:`(B, 1, T_{en})`\n            - energy: :math:`(B, 1, T_{de})`\n            - dr: :math:`(B, T_{en})`\n        \"\"\"\n        o_energy = self.energy_predictor(o_en, x_mask)\n        if energy is not None:\n            avg_energy = average_over_durations(energy, dr)\n            o_energy_emb = self.energy_emb(avg_energy)\n            return o_energy_emb, o_energy, avg_energy\n        o_energy_emb = self.energy_emb(o_energy)\n        return o_energy_emb, o_energy\n\n    def _forward_aligner(\n        self, x: torch.FloatTensor, y: torch.FloatTensor, x_mask: torch.IntTensor, y_mask: torch.IntTensor\n    ) -> Tuple[torch.IntTensor, torch.FloatTensor, torch.FloatTensor, torch.FloatTensor]:\n        \"\"\"Aligner forward pass.\n\n        1. Compute a mask to apply to the attention map.\n        2. Run the alignment network.\n        3. Apply MAS to compute the hard alignment map.\n        4. 
Compute the durations from the hard alignment map.\n\n        Args:\n            x (torch.FloatTensor): Input sequence.\n            y (torch.FloatTensor): Output sequence.\n            x_mask (torch.IntTensor): Input sequence mask.\n            y_mask (torch.IntTensor): Output sequence mask.\n\n        Returns:\n            Tuple[torch.IntTensor, torch.FloatTensor, torch.FloatTensor, torch.FloatTensor]:\n                Durations from the hard alignment map, soft alignment potentials, log scale alignment potentials,\n                hard alignment map.\n\n        Shapes:\n            - x: :math:`[B, T_en, C_en]`\n            - y: :math:`[B, T_de, C_de]`\n            - x_mask: :math:`[B, 1, T_en]`\n            - y_mask: :math:`[B, 1, T_de]`\n\n            - o_alignment_dur: :math:`[B, T_en]`\n            - alignment_soft: :math:`[B, T_en, T_de]`\n            - alignment_logprob: :math:`[B, 1, T_de, T_en]`\n            - alignment_mas: :math:`[B, T_en, T_de]`\n        \"\"\"\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        alignment_soft, alignment_logprob = self.aligner(y.transpose(1, 2), x.transpose(1, 2), x_mask, None)\n        alignment_mas = maximum_path(\n            alignment_soft.squeeze(1).transpose(1, 2).contiguous(), attn_mask.squeeze(1).contiguous()\n        )\n        o_alignment_dur = torch.sum(alignment_mas, -1).int()\n        alignment_soft = alignment_soft.squeeze(1).transpose(1, 2)\n        return o_alignment_dur, alignment_soft, alignment_logprob, alignment_mas\n\n    def _set_speaker_input(self, aux_input: Dict):\n        d_vectors = aux_input.get(\"d_vectors\", None)\n        speaker_ids = aux_input.get(\"speaker_ids\", None)\n\n        if d_vectors is not None and speaker_ids is not None:\n            raise ValueError(\"[!] Cannot use d-vectors and speaker-ids together.\")\n\n        if speaker_ids is not None and not hasattr(self, \"emb_g\"):\n            raise ValueError(\"[!] 
Cannot use speaker-ids without enabling speaker embedding.\")\n\n        g = speaker_ids if speaker_ids is not None else d_vectors\n        return g\n\n    def forward(\n        self,\n        x: torch.LongTensor,\n        x_lengths: torch.LongTensor,\n        y_lengths: torch.LongTensor,\n        y: torch.FloatTensor = None,\n        dr: torch.IntTensor = None,\n        pitch: torch.FloatTensor = None,\n        energy: torch.FloatTensor = None,\n        aux_input: Dict = {\"d_vectors\": None, \"speaker_ids\": None},  # pylint: disable=unused-argument\n    ) -> Dict:\n        \"\"\"Model's forward pass.\n\n        Args:\n            x (torch.LongTensor): Input character sequences.\n            x_lengths (torch.LongTensor): Input sequence lengths.\n            y_lengths (torch.LongTensor): Output sequence lengths.\n            y (torch.FloatTensor): Spectrogram frames. Only used when the alignment network is on. Defaults to None.\n            dr (torch.IntTensor): Character durations over the spectrogram frames. Only used when the alignment network is off. Defaults to None.\n            pitch (torch.FloatTensor): Pitch values for each spectrogram frame. Only used when the pitch predictor is on. Defaults to None.\n            energy (torch.FloatTensor): Energy values for each spectrogram frame. Only used when the energy predictor is on. Defaults to None.\n            aux_input (Dict): Auxiliary model inputs for multi-speaker training. 
Defaults to `{\"d_vectors\": None, \"speaker_ids\": None}`.\n\n        Shapes:\n            - x: :math:`[B, T_max]`\n            - x_lengths: :math:`[B]`\n            - y_lengths: :math:`[B]`\n            - y: :math:`[B, T_max2]`\n            - dr: :math:`[B, T_max]`\n            - g: :math:`[B, C]`\n            - pitch: :math:`[B, 1, T]`\n        \"\"\"\n        g = self._set_speaker_input(aux_input)\n        # compute sequence masks\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, None), 1).float()\n        x_mask = torch.unsqueeze(sequence_mask(x_lengths, x.shape[1]), 1).float()\n        # encoder pass\n        o_en, x_mask, g, x_emb = self._forward_encoder(x, x_mask, g)\n        # duration predictor pass\n        if self.args.detach_duration_predictor:\n            o_dr_log = self.duration_predictor(o_en.detach(), x_mask)\n        else:\n            o_dr_log = self.duration_predictor(o_en, x_mask)\n        o_dr = torch.clamp(torch.exp(o_dr_log) - 1, 0, self.max_duration)\n        # generate attn mask from predicted durations\n        o_attn = self.generate_attn(o_dr.squeeze(1), x_mask)\n        # aligner\n        o_alignment_dur = None\n        alignment_soft = None\n        alignment_logprob = None\n        alignment_mas = None\n        if self.use_aligner:\n            o_alignment_dur, alignment_soft, alignment_logprob, alignment_mas = self._forward_aligner(\n                x_emb, y, x_mask, y_mask\n            )\n            alignment_soft = alignment_soft.transpose(1, 2)\n            alignment_mas = alignment_mas.transpose(1, 2)\n            dr = o_alignment_dur\n        # pitch predictor pass\n        o_pitch = None\n        avg_pitch = None\n        if self.args.use_pitch:\n            o_pitch_emb, o_pitch, avg_pitch = self._forward_pitch_predictor(o_en, x_mask, pitch, dr)\n            o_en = o_en + o_pitch_emb\n        # energy predictor pass\n        o_energy = None\n        avg_energy = None\n        if self.args.use_energy:\n            
o_energy_emb, o_energy, avg_energy = self._forward_energy_predictor(o_en, x_mask, energy, dr)\n            o_en = o_en + o_energy_emb\n        # decoder pass\n        o_de, attn = self._forward_decoder(\n            o_en, dr, x_mask, y_lengths, g=None\n        )  # TODO: maybe pass speaker embedding (g) too\n        outputs = {\n            \"model_outputs\": o_de,  # [B, T, C]\n            \"durations_log\": o_dr_log.squeeze(1),  # [B, T]\n            \"durations\": o_dr.squeeze(1),  # [B, T]\n            \"attn_durations\": o_attn,  # for visualization [B, T_en, T_de']\n            \"pitch_avg\": o_pitch,\n            \"pitch_avg_gt\": avg_pitch,\n            \"energy_avg\": o_energy,\n            \"energy_avg_gt\": avg_energy,\n            \"alignments\": attn,  # [B, T_de, T_en]\n            \"alignment_soft\": alignment_soft,\n            \"alignment_mas\": alignment_mas,\n            \"o_alignment_dur\": o_alignment_dur,\n            \"alignment_logprob\": alignment_logprob,\n            \"x_mask\": x_mask,\n            \"y_mask\": y_mask,\n        }\n        return outputs\n\n    @torch.no_grad()\n    def inference(self, x, aux_input={\"d_vectors\": None, \"speaker_ids\": None}):  # pylint: disable=unused-argument\n        \"\"\"Model's inference pass.\n\n        Args:\n            x (torch.LongTensor): Input character sequence.\n            aux_input (Dict): Auxiliary model inputs. 
Defaults to `{\"d_vectors\": None, \"speaker_ids\": None}`.\n\n        Shapes:\n            - x: [B, T_max]\n            - x_lengths: [B]\n            - g: [B, C]\n        \"\"\"\n        g = self._set_speaker_input(aux_input)\n        x_lengths = torch.tensor(x.shape[1:2]).to(x.device)\n        x_mask = torch.unsqueeze(sequence_mask(x_lengths, x.shape[1]), 1).to(x.dtype).float()\n        # encoder pass\n        o_en, x_mask, g, _ = self._forward_encoder(x, x_mask, g)\n        # duration predictor pass\n        o_dr_log = self.duration_predictor(o_en, x_mask)\n        o_dr = self.format_durations(o_dr_log, x_mask).squeeze(1)\n        y_lengths = o_dr.sum(1)\n        # pitch predictor pass\n        o_pitch = None\n        if self.args.use_pitch:\n            o_pitch_emb, o_pitch = self._forward_pitch_predictor(o_en, x_mask)\n            o_en = o_en + o_pitch_emb\n        # energy predictor pass\n        o_energy = None\n        if self.args.use_energy:\n            o_energy_emb, o_energy = self._forward_energy_predictor(o_en, x_mask)\n            o_en = o_en + o_energy_emb\n        # decoder pass\n        o_de, attn = self._forward_decoder(o_en, o_dr, x_mask, y_lengths, g=None)\n        outputs = {\n            \"model_outputs\": o_de,\n            \"alignments\": attn,\n            \"pitch\": o_pitch,\n            \"energy\": o_energy,\n            \"durations_log\": o_dr_log,\n        }\n        return outputs\n\n    def train_step(self, batch: dict, criterion: nn.Module):\n        text_input = batch[\"text_input\"]\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        mel_lengths = batch[\"mel_lengths\"]\n        pitch = batch[\"pitch\"] if self.args.use_pitch else None\n        energy = batch[\"energy\"] if self.args.use_energy else None\n        d_vectors = batch[\"d_vectors\"]\n        speaker_ids = batch[\"speaker_ids\"]\n        durations = batch[\"durations\"]\n        aux_input = {\"d_vectors\": d_vectors, 
\"speaker_ids\": speaker_ids}\n\n        # forward pass\n        outputs = self.forward(\n            text_input,\n            text_lengths,\n            mel_lengths,\n            y=mel_input,\n            dr=durations,\n            pitch=pitch,\n            energy=energy,\n            aux_input=aux_input,\n        )\n        # use aligner's output as the duration target\n        if self.use_aligner:\n            durations = outputs[\"o_alignment_dur\"]\n        # use float32 in AMP\n        with autocast(enabled=False):\n            # compute loss\n            loss_dict = criterion(\n                decoder_output=outputs[\"model_outputs\"],\n                decoder_target=mel_input,\n                decoder_output_lens=mel_lengths,\n                dur_output=outputs[\"durations_log\"],\n                dur_target=durations,\n                pitch_output=outputs[\"pitch_avg\"] if self.use_pitch else None,\n                pitch_target=outputs[\"pitch_avg_gt\"] if self.use_pitch else None,\n                energy_output=outputs[\"energy_avg\"] if self.use_energy else None,\n                energy_target=outputs[\"energy_avg_gt\"] if self.use_energy else None,\n                input_lens=text_lengths,\n                alignment_logprob=outputs[\"alignment_logprob\"] if self.use_aligner else None,\n                alignment_soft=outputs[\"alignment_soft\"],\n                alignment_hard=outputs[\"alignment_mas\"],\n                binary_loss_weight=self.binary_loss_weight,\n            )\n            # compute duration error\n            durations_pred = outputs[\"durations\"]\n            duration_error = torch.abs(durations - durations_pred).sum() / text_lengths.sum()\n            loss_dict[\"duration_error\"] = duration_error\n\n        return outputs, loss_dict\n\n    def _create_logs(self, batch, outputs, ap):\n        \"\"\"Create common logger outputs.\"\"\"\n        model_outputs = outputs[\"model_outputs\"]\n        alignments = outputs[\"alignments\"]\n 
       mel_input = batch[\"mel_input\"]\n\n        pred_spec = model_outputs[0].data.cpu().numpy()\n        gt_spec = mel_input[0].data.cpu().numpy()\n        align_img = alignments[0].data.cpu().numpy()\n\n        figures = {\n            \"prediction\": plot_spectrogram(pred_spec, ap, output_fig=False),\n            \"ground_truth\": plot_spectrogram(gt_spec, ap, output_fig=False),\n            \"alignment\": plot_alignment(align_img, output_fig=False),\n        }\n\n        # plot pitch figures\n        if self.args.use_pitch:\n            pitch_avg = abs(outputs[\"pitch_avg_gt\"][0, 0].data.cpu().numpy())\n            pitch_avg_hat = abs(outputs[\"pitch_avg\"][0, 0].data.cpu().numpy())\n            chars = self.tokenizer.decode(batch[\"text_input\"][0].data.cpu().numpy())\n            pitch_figures = {\n                \"pitch_ground_truth\": plot_avg_pitch(pitch_avg, chars, output_fig=False),\n                \"pitch_avg_predicted\": plot_avg_pitch(pitch_avg_hat, chars, output_fig=False),\n            }\n            figures.update(pitch_figures)\n\n        # plot energy figures\n        if self.args.use_energy:\n            energy_avg = abs(outputs[\"energy_avg_gt\"][0, 0].data.cpu().numpy())\n            energy_avg_hat = abs(outputs[\"energy_avg\"][0, 0].data.cpu().numpy())\n            chars = self.tokenizer.decode(batch[\"text_input\"][0].data.cpu().numpy())\n            energy_figures = {\n                \"energy_ground_truth\": plot_avg_energy(energy_avg, chars, output_fig=False),\n                \"energy_avg_predicted\": plot_avg_energy(energy_avg_hat, chars, output_fig=False),\n            }\n            figures.update(energy_figures)\n\n        # plot the attention mask computed from the predicted durations\n        if \"attn_durations\" in outputs:\n            alignments_hat = outputs[\"attn_durations\"][0].data.cpu().numpy()\n            figures[\"alignment_hat\"] = plot_alignment(alignments_hat.T, output_fig=False)\n\n        # Sample audio\n     
   train_audio = ap.inv_melspectrogram(pred_spec.T)\n        return figures, {\"audio\": train_audio}\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ) -> None:  # pylint: disable=no-self-use\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    def eval_step(self, batch: dict, criterion: nn.Module):\n        return self.train_step(batch, criterion)\n\n    def eval_log(self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int) -> None:\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n\n    def get_criterion(self):\n        from TTS.tts.layers.losses import ForwardTTSLoss  # pylint: disable=import-outside-toplevel\n\n        return ForwardTTSLoss(self.config)\n\n    def on_train_step_start(self, trainer):\n        \"\"\"Schedule binary loss weight.\"\"\"\n        self.binary_loss_weight = min(trainer.epochs_done / self.config.binary_loss_warmup_epochs, 1.0) * 1.0\n\n    @staticmethod\n    def init_from_config(config: \"ForwardTTSConfig\", samples: Union[List[List], List[Dict]] = None):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (ForwardTTSConfig): Model config.\n            samples (Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n        
\"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config)\n        tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config, samples)\n        return ForwardTTS(new_config, ap, tokenizer, speaker_manager)\n"
  },
  {
    "path": "TTS/tts/models/glow_tts.py",
    "content": "import math\nfrom typing import Dict, List, Tuple, Union\n\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.cuda.amp.autocast_mode import autocast\nfrom torch.nn import functional as F\n\nfrom TTS.tts.configs.glow_tts_config import GlowTTSConfig\nfrom TTS.tts.layers.glow_tts.decoder import Decoder\nfrom TTS.tts.layers.glow_tts.encoder import Encoder\nfrom TTS.tts.models.base_tts import BaseTTS\nfrom TTS.tts.utils.helpers import generate_path, maximum_path, sequence_mask\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.synthesis import synthesis\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\nfrom TTS.utils.io import load_fsspec\n\n\nclass GlowTTS(BaseTTS):\n    \"\"\"GlowTTS model.\n\n    Paper::\n        https://arxiv.org/abs/2005.11129\n\n    Paper abstract::\n        Recently, text-to-speech (TTS) models such as FastSpeech and ParaNet have been proposed to generate\n        mel-spectrograms from text in parallel. Despite the advantage, the parallel TTS models cannot be trained\n        without guidance from autoregressive TTS models as their external aligners. In this work, we propose Glow-TTS,\n        a flow-based generative model for parallel TTS that does not require any external aligner. By combining the\n        properties of flows and dynamic programming, the proposed model searches for the most probable monotonic\n        alignment between text and the latent representation of speech on its own. We demonstrate that enforcing hard\n        monotonic alignments enables robust TTS, which generalizes to long utterances, and employing generative flows\n        enables fast, diverse, and controllable speech synthesis. Glow-TTS obtains an order-of-magnitude speed-up over\n        the autoregressive model, Tacotron 2, at synthesis with comparable speech quality. 
We further show that our\n        model can be easily extended to a multi-speaker setting.\n\n    Check :class:`TTS.tts.configs.glow_tts_config.GlowTTSConfig` for class arguments.\n\n    Examples:\n        Init only model layers.\n\n        >>> from TTS.tts.configs.glow_tts_config import GlowTTSConfig\n        >>> from TTS.tts.models.glow_tts import GlowTTS\n        >>> config = GlowTTSConfig(num_chars=2)\n        >>> model = GlowTTS(config)\n\n        Fully init a model ready for action. All the class attributes and class members\n        (e.g Tokenizer, AudioProcessor, etc.). are initialized internally based on config values.\n\n        >>> from TTS.tts.configs.glow_tts_config import GlowTTSConfig\n        >>> from TTS.tts.models.glow_tts import GlowTTS\n        >>> config = GlowTTSConfig()\n        >>> model = GlowTTS.init_from_config(config, verbose=False)\n    \"\"\"\n\n    def __init__(\n        self,\n        config: GlowTTSConfig,\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n\n        # pass all config fields to `self`\n        # for fewer code change\n        self.config = config\n        for key in config:\n            setattr(self, key, config[key])\n\n        self.decoder_output_dim = config.out_channels\n\n        # init multi-speaker layers if necessary\n        self.init_multispeaker(config)\n\n        self.run_data_dep_init = config.data_dep_init_steps > 0\n        self.encoder = Encoder(\n            self.num_chars,\n            out_channels=self.out_channels,\n            hidden_channels=self.hidden_channels_enc,\n            hidden_channels_dp=self.hidden_channels_dp,\n            encoder_type=self.encoder_type,\n            encoder_params=self.encoder_params,\n            mean_only=self.mean_only,\n            use_prenet=self.use_encoder_prenet,\n            
dropout_p_dp=self.dropout_p_dp,\n            c_in_channels=self.c_in_channels,\n        )\n\n        self.decoder = Decoder(\n            self.out_channels,\n            self.hidden_channels_dec,\n            self.kernel_size_dec,\n            self.dilation_rate,\n            self.num_flow_blocks_dec,\n            self.num_block_layers,\n            dropout_p=self.dropout_p_dec,\n            num_splits=self.num_splits,\n            num_squeeze=self.num_squeeze,\n            sigmoid_scale=self.sigmoid_scale,\n            c_in_channels=self.c_in_channels,\n        )\n\n    def init_multispeaker(self, config: Coqpit):\n        \"\"\"Init speaker embedding layer if `use_speaker_embedding` is True and set the expected speaker embedding\n        vector dimension to the encoder layer channel size. If model uses d-vectors, then it only sets\n        speaker embedding vector dimension to the d-vector dimension from the config.\n\n        Args:\n            config (Coqpit): Model configuration.\n        \"\"\"\n        self.embedded_speaker_dim = 0\n        # set number of speakers - if num_speakers is set in config, use it, otherwise use speaker_manager\n        if self.speaker_manager is not None:\n            self.num_speakers = self.speaker_manager.num_speakers\n        # set ultimate speaker embedding size\n        if config.use_d_vector_file:\n            self.embedded_speaker_dim = (\n                config.d_vector_dim if \"d_vector_dim\" in config and config.d_vector_dim is not None else 512\n            )\n            if self.speaker_manager is not None:\n                assert (\n                    config.d_vector_dim == self.speaker_manager.embedding_dim\n                ), \" [!] 
d-vector dimension mismatch b/w config and speaker manager.\"\n        # init speaker embedding layer\n        if config.use_speaker_embedding and not config.use_d_vector_file:\n            print(\" > Init speaker_embedding layer.\")\n            self.embedded_speaker_dim = self.hidden_channels_enc\n            self.emb_g = nn.Embedding(self.num_speakers, self.hidden_channels_enc)\n            nn.init.uniform_(self.emb_g.weight, -0.1, 0.1)\n        # set conditioning dimensions\n        self.c_in_channels = self.embedded_speaker_dim\n\n    @staticmethod\n    def compute_outputs(attn, o_mean, o_log_scale, x_mask):\n        \"\"\"Compute and format the mode outputs with the given alignment map\"\"\"\n        y_mean = torch.matmul(attn.squeeze(1).transpose(1, 2), o_mean.transpose(1, 2)).transpose(\n            1, 2\n        )  # [b, t', t], [b, t, d] -> [b, d, t']\n        y_log_scale = torch.matmul(attn.squeeze(1).transpose(1, 2), o_log_scale.transpose(1, 2)).transpose(\n            1, 2\n        )  # [b, t', t], [b, t, d] -> [b, d, t']\n        # compute total duration with adjustment\n        o_attn_dur = torch.log(1 + torch.sum(attn, -1)) * x_mask\n        return y_mean, y_log_scale, o_attn_dur\n\n    def unlock_act_norm_layers(self):\n        \"\"\"Unlock activation normalization layers for data depended initalization.\"\"\"\n        for f in self.decoder.flows:\n            if getattr(f, \"set_ddi\", False):\n                f.set_ddi(True)\n\n    def lock_act_norm_layers(self):\n        \"\"\"Lock activation normalization layers.\"\"\"\n        for f in self.decoder.flows:\n            if getattr(f, \"set_ddi\", False):\n                f.set_ddi(False)\n\n    def _set_speaker_input(self, aux_input: Dict):\n        if aux_input is None:\n            d_vectors = None\n            speaker_ids = None\n        else:\n            d_vectors = aux_input.get(\"d_vectors\", None)\n            speaker_ids = aux_input.get(\"speaker_ids\", None)\n\n        if d_vectors is 
not None and speaker_ids is not None:\n            raise ValueError(\"[!] Cannot use d-vectors and speaker-ids together.\")\n\n        if speaker_ids is not None and not hasattr(self, \"emb_g\"):\n            raise ValueError(\"[!] Cannot use speaker-ids without enabling speaker embedding.\")\n\n        g = speaker_ids if speaker_ids is not None else d_vectors\n        return g\n\n    def _speaker_embedding(self, aux_input: Dict) -> Union[torch.tensor, None]:\n        g = self._set_speaker_input(aux_input)\n        # speaker embedding\n        if g is not None:\n            if hasattr(self, \"emb_g\"):\n                # use speaker embedding layer\n                if not g.size():  # if g is a scalar\n                    g = g.unsqueeze(0)  # unsqueeze\n                g = F.normalize(self.emb_g(g)).unsqueeze(-1)  # [b, h, 1]\n            else:\n                # use d-vector\n                g = F.normalize(g).unsqueeze(-1)  # [b, h, 1]\n        return g\n\n    def forward(\n        self, x, x_lengths, y, y_lengths=None, aux_input={\"d_vectors\": None, \"speaker_ids\": None}\n    ):  # pylint: disable=dangerous-default-value\n        \"\"\"\n        Args:\n            x (torch.Tensor):\n                Input text sequence ids. :math:`[B, T_en]`\n\n            x_lengths (torch.Tensor):\n                Lengths of input text sequences. :math:`[B]`\n\n            y (torch.Tensor):\n                Target mel-spectrogram frames. :math:`[B, T_de, C_mel]`\n\n            y_lengths (torch.Tensor):\n                Lengths of target mel-spectrogram frames. :math:`[B]`\n\n            aux_input (Dict):\n                Auxiliary inputs. `d_vectors` is speaker embedding vectors for a multi-speaker model.\n                :math:`[B, D_vec]`. `speaker_ids` is speaker ids for a multi-speaker model using speaker-embedding\n                layer. 
:math:`B`\n\n        Returns:\n            Dict:\n                - z: :math:`[B, T_de, C]`\n                - logdet: :math:`B`\n                - y_mean: :math:`[B, T_de, C]`\n                - y_log_scale: :math:`[B, T_de, C]`\n                - alignments: :math:`[B, T_en, T_de]`\n                - durations_log: :math:`[B, T_en, 1]`\n                - total_durations_log: :math:`[B, T_en, 1]`\n        \"\"\"\n        # [B, T, C] -> [B, C, T]\n        y = y.transpose(1, 2)\n        y_max_length = y.size(2)\n        # norm speaker embeddings\n        g = self._speaker_embedding(aux_input)\n        # embedding pass\n        o_mean, o_log_scale, o_dur_log, x_mask = self.encoder(x, x_lengths, g=g)\n        # drop residual frames wrt num_squeeze and set y_lengths.\n        y, y_lengths, y_max_length, attn = self.preprocess(y, y_lengths, y_max_length, None)\n        # create masks\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, y_max_length), 1).to(x_mask.dtype)\n        # [B, 1, T_en, T_de]\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        # decoder pass\n        z, logdet = self.decoder(y, y_mask, g=g, reverse=False)\n        # find the alignment path\n        with torch.no_grad():\n            o_scale = torch.exp(-2 * o_log_scale)\n            logp1 = torch.sum(-0.5 * math.log(2 * math.pi) - o_log_scale, [1]).unsqueeze(-1)  # [b, t, 1]\n            logp2 = torch.matmul(o_scale.transpose(1, 2), -0.5 * (z**2))  # [b, t, d] x [b, d, t'] = [b, t, t']\n            logp3 = torch.matmul((o_mean * o_scale).transpose(1, 2), z)  # [b, t, d] x [b, d, t'] = [b, t, t']\n            logp4 = torch.sum(-0.5 * (o_mean**2) * o_scale, [1]).unsqueeze(-1)  # [b, t, 1]\n            logp = logp1 + logp2 + logp3 + logp4  # [b, t, t']\n            attn = maximum_path(logp, attn_mask.squeeze(1)).unsqueeze(1).detach()\n        y_mean, y_log_scale, o_attn_dur = self.compute_outputs(attn, o_mean, o_log_scale, x_mask)\n        attn = 
attn.squeeze(1).permute(0, 2, 1)\n        outputs = {\n            \"z\": z.transpose(1, 2),\n            \"logdet\": logdet,\n            \"y_mean\": y_mean.transpose(1, 2),\n            \"y_log_scale\": y_log_scale.transpose(1, 2),\n            \"alignments\": attn,\n            \"durations_log\": o_dur_log.transpose(1, 2),\n            \"total_durations_log\": o_attn_dur.transpose(1, 2),\n        }\n        return outputs\n\n    @torch.no_grad()\n    def inference_with_MAS(\n        self, x, x_lengths, y=None, y_lengths=None, aux_input={\"d_vectors\": None, \"speaker_ids\": None}\n    ):  # pylint: disable=dangerous-default-value\n        \"\"\"\n        It's similar to the teacher forcing in Tacotron.\n        It was proposed in: https://arxiv.org/abs/2104.05557\n\n        Shapes:\n            - x: :math:`[B, T]`\n            - x_lenghts: :math:`B`\n            - y: :math:`[B, T, C]`\n            - y_lengths: :math:`B`\n            - g: :math:`[B, C] or B`\n        \"\"\"\n        y = y.transpose(1, 2)\n        y_max_length = y.size(2)\n        # norm speaker embeddings\n        g = self._speaker_embedding(aux_input)\n        # embedding pass\n        o_mean, o_log_scale, o_dur_log, x_mask = self.encoder(x, x_lengths, g=g)\n        # drop redisual frames wrt num_squeeze and set y_lengths.\n        y, y_lengths, y_max_length, attn = self.preprocess(y, y_lengths, y_max_length, None)\n        # create masks\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, y_max_length), 1).to(x_mask.dtype)\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        # decoder pass\n        z, logdet = self.decoder(y, y_mask, g=g, reverse=False)\n        # find the alignment path between z and encoder output\n        o_scale = torch.exp(-2 * o_log_scale)\n        logp1 = torch.sum(-0.5 * math.log(2 * math.pi) - o_log_scale, [1]).unsqueeze(-1)  # [b, t, 1]\n        logp2 = torch.matmul(o_scale.transpose(1, 2), -0.5 * (z**2))  # [b, t, d] x [b, d, 
t'] = [b, t, t']\n        logp3 = torch.matmul((o_mean * o_scale).transpose(1, 2), z)  # [b, t, d] x [b, d, t'] = [b, t, t']\n        logp4 = torch.sum(-0.5 * (o_mean**2) * o_scale, [1]).unsqueeze(-1)  # [b, t, 1]\n        logp = logp1 + logp2 + logp3 + logp4  # [b, t, t']\n        attn = maximum_path(logp, attn_mask.squeeze(1)).unsqueeze(1).detach()\n\n        y_mean, y_log_scale, o_attn_dur = self.compute_outputs(attn, o_mean, o_log_scale, x_mask)\n        attn = attn.squeeze(1).permute(0, 2, 1)\n\n        # get predited aligned distribution\n        z = y_mean * y_mask\n\n        # reverse the decoder and predict using the aligned distribution\n        y, logdet = self.decoder(z, y_mask, g=g, reverse=True)\n        outputs = {\n            \"model_outputs\": z.transpose(1, 2),\n            \"logdet\": logdet,\n            \"y_mean\": y_mean.transpose(1, 2),\n            \"y_log_scale\": y_log_scale.transpose(1, 2),\n            \"alignments\": attn,\n            \"durations_log\": o_dur_log.transpose(1, 2),\n            \"total_durations_log\": o_attn_dur.transpose(1, 2),\n        }\n        return outputs\n\n    @torch.no_grad()\n    def decoder_inference(\n        self, y, y_lengths=None, aux_input={\"d_vectors\": None, \"speaker_ids\": None}\n    ):  # pylint: disable=dangerous-default-value\n        \"\"\"\n        Shapes:\n            - y: :math:`[B, T, C]`\n            - y_lengths: :math:`B`\n            - g: :math:`[B, C] or B`\n        \"\"\"\n        y = y.transpose(1, 2)\n        y_max_length = y.size(2)\n        g = self._speaker_embedding(aux_input)\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, y_max_length), 1).to(y.dtype)\n        # decoder pass\n        z, logdet = self.decoder(y, y_mask, g=g, reverse=False)\n        # reverse decoder and predict\n        y, logdet = self.decoder(z, y_mask, g=g, reverse=True)\n        outputs = {}\n        outputs[\"model_outputs\"] = y.transpose(1, 2)\n        outputs[\"logdet\"] = logdet\n        
return outputs\n\n    @torch.no_grad()\n    def inference(\n        self, x, aux_input={\"x_lengths\": None, \"d_vectors\": None, \"speaker_ids\": None}\n    ):  # pylint: disable=dangerous-default-value\n        x_lengths = aux_input[\"x_lengths\"]\n        g = self._speaker_embedding(aux_input)\n        # embedding pass\n        o_mean, o_log_scale, o_dur_log, x_mask = self.encoder(x, x_lengths, g=g)\n        # compute output durations\n        w = (torch.exp(o_dur_log) - 1) * x_mask * self.length_scale\n        w_ceil = torch.clamp_min(torch.ceil(w), 1)\n        y_lengths = torch.clamp_min(torch.sum(w_ceil, [1, 2]), 1).long()\n        y_max_length = None\n        # compute masks\n        y_mask = torch.unsqueeze(sequence_mask(y_lengths, y_max_length), 1).to(x_mask.dtype)\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        # compute attention mask\n        attn = generate_path(w_ceil.squeeze(1), attn_mask.squeeze(1)).unsqueeze(1)\n        y_mean, y_log_scale, o_attn_dur = self.compute_outputs(attn, o_mean, o_log_scale, x_mask)\n\n        z = (y_mean + torch.exp(y_log_scale) * torch.randn_like(y_mean) * self.inference_noise_scale) * y_mask\n        # decoder pass\n        y, logdet = self.decoder(z, y_mask, g=g, reverse=True)\n        attn = attn.squeeze(1).permute(0, 2, 1)\n        outputs = {\n            \"model_outputs\": y.transpose(1, 2),\n            \"logdet\": logdet,\n            \"y_mean\": y_mean.transpose(1, 2),\n            \"y_log_scale\": y_log_scale.transpose(1, 2),\n            \"alignments\": attn,\n            \"durations_log\": o_dur_log.transpose(1, 2),\n            \"total_durations_log\": o_attn_dur.transpose(1, 2),\n        }\n        return outputs\n\n    def train_step(self, batch: dict, criterion: nn.Module):\n        \"\"\"A single training step. Forward pass and loss computation. 
Run data depended initialization for the\n        first `config.data_dep_init_steps` steps.\n\n        Args:\n            batch (dict): [description]\n            criterion (nn.Module): [description]\n        \"\"\"\n        text_input = batch[\"text_input\"]\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        mel_lengths = batch[\"mel_lengths\"]\n        d_vectors = batch[\"d_vectors\"]\n        speaker_ids = batch[\"speaker_ids\"]\n\n        if self.run_data_dep_init and self.training:\n            # compute data-dependent initialization of activation norm layers\n            self.unlock_act_norm_layers()\n            with torch.no_grad():\n                _ = self.forward(\n                    text_input,\n                    text_lengths,\n                    mel_input,\n                    mel_lengths,\n                    aux_input={\"d_vectors\": d_vectors, \"speaker_ids\": speaker_ids},\n                )\n            outputs = None\n            loss_dict = None\n            self.lock_act_norm_layers()\n        else:\n            # normal training step\n            outputs = self.forward(\n                text_input,\n                text_lengths,\n                mel_input,\n                mel_lengths,\n                aux_input={\"d_vectors\": d_vectors, \"speaker_ids\": speaker_ids},\n            )\n\n            with autocast(enabled=False):  # avoid mixed_precision in criterion\n                loss_dict = criterion(\n                    outputs[\"z\"].float(),\n                    outputs[\"y_mean\"].float(),\n                    outputs[\"y_log_scale\"].float(),\n                    outputs[\"logdet\"].float(),\n                    mel_lengths,\n                    outputs[\"durations_log\"].float(),\n                    outputs[\"total_durations_log\"].float(),\n                    text_lengths,\n                )\n        return outputs, loss_dict\n\n    def _create_logs(self, batch, outputs, ap):\n   
     alignments = outputs[\"alignments\"]\n        text_input = batch[\"text_input\"][:1] if batch[\"text_input\"] is not None else None\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        d_vectors = batch[\"d_vectors\"][:1] if batch[\"d_vectors\"] is not None else None\n        speaker_ids = batch[\"speaker_ids\"][:1] if batch[\"speaker_ids\"] is not None else None\n\n        # model runs reverse flow to predict spectrograms\n        pred_outputs = self.inference(\n            text_input,\n            aux_input={\"x_lengths\": text_lengths[:1], \"d_vectors\": d_vectors, \"speaker_ids\": speaker_ids},\n        )\n        model_outputs = pred_outputs[\"model_outputs\"]\n\n        pred_spec = model_outputs[0].data.cpu().numpy()\n        gt_spec = mel_input[0].data.cpu().numpy()\n        align_img = alignments[0].data.cpu().numpy()\n\n        figures = {\n            \"prediction\": plot_spectrogram(pred_spec, ap, output_fig=False),\n            \"ground_truth\": plot_spectrogram(gt_spec, ap, output_fig=False),\n            \"alignment\": plot_alignment(align_img, output_fig=False),\n        }\n\n        # Sample audio\n        train_audio = ap.inv_melspectrogram(pred_spec.T)\n        return figures, {\"audio\": train_audio}\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ) -> None:  # pylint: disable=no-self-use\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    @torch.no_grad()\n    def eval_step(self, batch: dict, criterion: nn.Module):\n        return self.train_step(batch, criterion)\n\n    def eval_log(self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int) -> None:\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.eval_figures(steps, figures)\n        
logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    @torch.no_grad()\n    def test_run(self, assets: Dict) -> Tuple[Dict, Dict]:\n        \"\"\"Generic test run for `tts` models used by `Trainer`.\n\n        You can override this for a different behaviour.\n\n        Returns:\n            Tuple[Dict, Dict]: Test figures and audios to be projected to Tensorboard.\n        \"\"\"\n        print(\" | > Synthesizing test sentences.\")\n        test_audios = {}\n        test_figures = {}\n        test_sentences = self.config.test_sentences\n        aux_inputs = self._get_test_aux_input()\n        if len(test_sentences) == 0:\n            print(\" | [!] No test sentences provided.\")\n        else:\n            for idx, sen in enumerate(test_sentences):\n                outputs = synthesis(\n                    self,\n                    sen,\n                    self.config,\n                    \"cuda\" in str(next(self.parameters()).device),\n                    speaker_id=aux_inputs[\"speaker_id\"],\n                    d_vector=aux_inputs[\"d_vector\"],\n                    style_wav=aux_inputs[\"style_wav\"],\n                    use_griffin_lim=True,\n                    do_trim_silence=False,\n                )\n\n                test_audios[\"{}-audio\".format(idx)] = outputs[\"wav\"]\n                test_figures[\"{}-prediction\".format(idx)] = plot_spectrogram(\n                    outputs[\"outputs\"][\"model_outputs\"], self.ap, output_fig=False\n                )\n                test_figures[\"{}-alignment\".format(idx)] = plot_alignment(outputs[\"alignments\"], output_fig=False)\n        return test_figures, test_audios\n\n    def preprocess(self, y, y_lengths, y_max_length, attn=None):\n        if y_max_length is not None:\n            y_max_length = (y_max_length // self.num_squeeze) * self.num_squeeze\n            y = y[:, :, :y_max_length]\n            if attn is not None:\n                attn = attn[:, :, :, :y_max_length]\n        
y_lengths = torch.div(y_lengths, self.num_squeeze, rounding_mode=\"floor\") * self.num_squeeze\n        return y, y_lengths, y_max_length, attn\n\n    def store_inverse(self):\n        self.decoder.store_inverse()\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"))\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            self.store_inverse()\n            assert not self.training\n\n    @staticmethod\n    def get_criterion():\n        from TTS.tts.layers.losses import GlowTTSLoss  # pylint: disable=import-outside-toplevel\n\n        return GlowTTSLoss()\n\n    def on_train_step_start(self, trainer):\n        \"\"\"Decide on every training step whether to enable or disable data-dependent initialization.\"\"\"\n        self.run_data_dep_init = trainer.total_steps_done < self.data_dep_init_steps\n\n    @staticmethod\n    def init_from_config(config: \"GlowTTSConfig\", samples: Union[List[List], List[Dict]] = None, verbose=True):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (GlowTTSConfig): Model config.\n            samples (Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n            verbose (bool): If True, print init messages. Defaults to True.\n        \"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config, verbose)\n        tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config, samples)\n        return GlowTTS(new_config, ap, tokenizer, speaker_manager)\n"
  },
  {
    "path": "TTS/tts/models/neuralhmm_tts.py",
    "content": "import os\nfrom typing import Dict, List, Union\n\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom trainer.logging.tensorboard_logger import TensorboardLogger\n\nfrom TTS.tts.layers.overflow.common_layers import Encoder, OverflowUtils\nfrom TTS.tts.layers.overflow.neural_hmm import NeuralHMM\nfrom TTS.tts.layers.overflow.plotting_utils import (\n    get_spec_from_most_probable_state,\n    plot_transition_probabilities_to_numpy,\n)\nfrom TTS.tts.models.base_tts import BaseTTS\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\nfrom TTS.utils.generic_utils import format_aux_input\nfrom TTS.utils.io import load_fsspec\n\n\nclass NeuralhmmTTS(BaseTTS):\n    \"\"\"Neural HMM TTS model.\n\n    Paper::\n        https://arxiv.org/abs/2108.13320\n\n    Paper abstract::\n        Neural sequence-to-sequence TTS has achieved significantly better output quality\n    than statistical speech synthesis using HMMs.However, neural TTS is generally not probabilistic\n    and uses non-monotonic attention. Attention failures increase training time and can make\n    synthesis babble incoherently. 
This paper describes how the old and new paradigms can be\n    combined to obtain the advantages of both worlds, by replacing attention in neural TTS with\n    an autoregressive left-right no-skip hidden Markov model defined by a neural network.\n    Based on this proposal, we modify Tacotron 2 to obtain an HMM-based neural TTS model with\n    monotonic alignment, trained to maximise the full sequence likelihood without approximation.\n    We also describe how to combine ideas from classical and contemporary TTS for best results.\n    The resulting example system is smaller and simpler than Tacotron 2, and learns to speak with\n    fewer iterations and less data, whilst achieving comparable naturalness prior to the post-net.\n    Our approach also allows easy control over speaking rate. Audio examples and code\n    are available at https://shivammehta25.github.io/Neural-HMM/ .\n\n    Note:\n        - This is a parameter-efficient version of OverFlow (15.3M vs 28.6M parameters). Since it has roughly half\n        as many parameters as OverFlow, the synthesis output quality is suboptimal (but comparable to Tacotron2\n        without Postnet), but it learns to speak with even less data and is still significantly faster\n        than attention-based methods.\n\n        - Neural HMM uses flat-start initialization, i.e. it computes the mean, std and transition probability\n        of the dataset and uses them to initialize the model. This benefits the model and helps it learn faster.\n        If you change the dataset or want to regenerate the parameters, change `force_generate_statistics` and\n        `mel_statistics_parameter_path` accordingly.\n\n        - To enable multi-GPU training, set `use_grad_checkpointing=False` in the config.\n        This will significantly increase memory usage.  
This is because to compute\n        the actual data likelihood (not an approximation using MAS/Viterbi) we must use\n        all the states at the previous time step during the forward pass to decide the\n        probability distribution at the current step i.e the difference between the forward\n        algorithm and viterbi approximation.\n\n    Check :class:`TTS.tts.configs.neuralhmm_tts_config.NeuralhmmTTSConfig` for class arguments.\n    \"\"\"\n\n    def __init__(\n        self,\n        config: \"NeuralhmmTTSConfig\",\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n\n        # pass all config fields to `self`\n        # for fewer code change\n        self.config = config\n        for key in config:\n            setattr(self, key, config[key])\n\n        self.encoder = Encoder(config.num_chars, config.state_per_phone, config.encoder_in_out_features)\n        self.neural_hmm = NeuralHMM(\n            frame_channels=self.out_channels,\n            ar_order=self.ar_order,\n            deterministic_transition=self.deterministic_transition,\n            encoder_dim=self.encoder_in_out_features,\n            prenet_type=self.prenet_type,\n            prenet_dim=self.prenet_dim,\n            prenet_n_layers=self.prenet_n_layers,\n            prenet_dropout=self.prenet_dropout,\n            prenet_dropout_at_inference=self.prenet_dropout_at_inference,\n            memory_rnn_dim=self.memory_rnn_dim,\n            outputnet_size=self.outputnet_size,\n            flat_start_params=self.flat_start_params,\n            std_floor=self.std_floor,\n            use_grad_checkpointing=self.use_grad_checkpointing,\n        )\n\n        self.register_buffer(\"mean\", torch.tensor(0))\n        self.register_buffer(\"std\", torch.tensor(1))\n\n    def update_mean_std(self, statistics_dict: Dict):\n        self.mean.data = 
torch.tensor(statistics_dict[\"mean\"])\n        self.std.data = torch.tensor(statistics_dict[\"std\"])\n\n    def preprocess_batch(self, text, text_len, mels, mel_len):\n        if self.mean.item() == 0 or self.std.item() == 1:\n            statistics_dict = torch.load(self.mel_statistics_parameter_path)\n            self.update_mean_std(statistics_dict)\n\n        mels = self.normalize(mels)\n        return text, text_len, mels, mel_len\n\n    def normalize(self, x):\n        return x.sub(self.mean).div(self.std)\n\n    def inverse_normalize(self, x):\n        return x.mul(self.std).add(self.mean)\n\n    def forward(self, text, text_len, mels, mel_len):\n        \"\"\"\n        Forward pass for training and computing the log likelihood of a given batch.\n\n        Shapes:\n            Shapes:\n            text: :math:`[B, T_in]`\n            text_len: :math:`[B]`\n            mels: :math:`[B, T_out, C]`\n            mel_len: :math:`[B]`\n        \"\"\"\n        text, text_len, mels, mel_len = self.preprocess_batch(text, text_len, mels, mel_len)\n        encoder_outputs, encoder_output_len = self.encoder(text, text_len)\n\n        log_probs, fwd_alignments, transition_vectors, means = self.neural_hmm(\n            encoder_outputs, encoder_output_len, mels.transpose(1, 2), mel_len\n        )\n\n        outputs = {\n            \"log_probs\": log_probs,\n            \"alignments\": fwd_alignments,\n            \"transition_vectors\": transition_vectors,\n            \"means\": means,\n        }\n\n        return outputs\n\n    @staticmethod\n    def _training_stats(batch):\n        stats = {}\n        stats[\"avg_text_length\"] = batch[\"text_lengths\"].float().mean()\n        stats[\"avg_spec_length\"] = batch[\"mel_lengths\"].float().mean()\n        stats[\"avg_text_batch_occupancy\"] = (batch[\"text_lengths\"].float() / batch[\"text_lengths\"].float().max()).mean()\n        stats[\"avg_spec_batch_occupancy\"] = (batch[\"mel_lengths\"].float() / 
batch[\"mel_lengths\"].float().max()).mean()\n        return stats\n\n    def train_step(self, batch: dict, criterion: nn.Module):\n        text_input = batch[\"text_input\"]\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        mel_lengths = batch[\"mel_lengths\"]\n\n        outputs = self.forward(\n            text=text_input,\n            text_len=text_lengths,\n            mels=mel_input,\n            mel_len=mel_lengths,\n        )\n        loss_dict = criterion(outputs[\"log_probs\"] / (mel_lengths.sum() + text_lengths.sum()))\n\n        # for printing useful statistics on terminal\n        loss_dict.update(self._training_stats(batch))\n        return outputs, loss_dict\n\n    def eval_step(self, batch: Dict, criterion: nn.Module):\n        return self.train_step(batch, criterion)\n\n    def _format_aux_input(self, aux_input: Dict, default_input_dict):\n        \"\"\"Set missing fields to their default value.\n\n        Args:\n            aux_inputs (Dict): Dictionary containing the auxiliary inputs.\n        \"\"\"\n        default_input_dict = default_input_dict.copy()\n        default_input_dict.update(\n            {\n                \"sampling_temp\": self.sampling_temp,\n                \"max_sampling_time\": self.max_sampling_time,\n                \"duration_threshold\": self.duration_threshold,\n            }\n        )\n        if aux_input:\n            return format_aux_input(default_input_dict, aux_input)\n        return default_input_dict\n\n    @torch.no_grad()\n    def inference(\n        self,\n        text: torch.Tensor,\n        aux_input={\"x_lengths\": None, \"sampling_temp\": None, \"max_sampling_time\": None, \"duration_threshold\": None},\n    ):  # pylint: disable=dangerous-default-value\n        \"\"\"Sampling from the model\n\n        Args:\n            text (torch.Tensor): :math:`[B, T_in]`\n            aux_inputs (_type_, optional): _description_. 
Defaults to None.\n\n        Returns:\n            outputs: Dictionary containing the following\n                - mel (torch.Tensor): :math:`[B, T_out, C]`\n                - hmm_outputs_len (torch.Tensor): :math:`[B]`\n                - state_travelled (List[List[int]]): List of lists containing the states travelled for each sample in the batch.\n                - input_parameters (list[torch.FloatTensor]): Input parameters to the neural HMM.\n                - output_parameters (list[torch.FloatTensor]): Output parameters of the neural HMM.\n        \"\"\"\n        default_input_dict = {\n            \"x_lengths\": torch.sum(text != 0, dim=1),\n        }\n        aux_input = self._format_aux_input(aux_input, default_input_dict)\n        encoder_outputs, encoder_output_len = self.encoder.inference(text, aux_input[\"x_lengths\"])\n        outputs = self.neural_hmm.inference(\n            encoder_outputs,\n            encoder_output_len,\n            sampling_temp=aux_input[\"sampling_temp\"],\n            max_sampling_time=aux_input[\"max_sampling_time\"],\n            duration_threshold=aux_input[\"duration_threshold\"],\n        )\n        mels, mel_outputs_len = outputs[\"hmm_outputs\"], outputs[\"hmm_outputs_len\"]\n\n        mels = self.inverse_normalize(mels)\n        outputs.update({\"model_outputs\": mels, \"model_outputs_len\": mel_outputs_len})\n        outputs[\"alignments\"] = OverflowUtils.double_pad(outputs[\"alignments\"])\n        return outputs\n\n    @staticmethod\n    def get_criterion():\n        return NLLLoss()\n\n    @staticmethod\n    def init_from_config(config: \"NeuralhmmTTSConfig\", samples: Union[List[List], List[Dict]] = None, verbose=True):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (NeuralhmmTTSConfig): Model config.\n            samples (Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n            verbose (bool): If True, print init 
messages. Defaults to True.\n        \"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config, verbose)\n        tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config, samples)\n        return NeuralhmmTTS(new_config, ap, tokenizer, speaker_manager)\n\n    def load_checkpoint(\n        self, config: Coqpit, checkpoint_path: str, eval: bool = False, strict: bool = True, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"))\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n\n    def on_init_start(self, trainer):\n        \"\"\"If the current dataset does not have normalisation statistics and initialisation transition_probability it computes them otherwise loads.\"\"\"\n        if not os.path.isfile(trainer.config.mel_statistics_parameter_path) or trainer.config.force_generate_statistics:\n            dataloader = trainer.get_train_dataloader(\n                training_assets=None, samples=trainer.train_samples, verbose=False\n            )\n            print(\n                f\" | > Data parameters not found for: {trainer.config.mel_statistics_parameter_path}. 
Computing mel normalization parameters...\"\n            )\n            data_mean, data_std, init_transition_prob = OverflowUtils.get_data_parameters_for_flat_start(\n                dataloader, trainer.config.out_channels, trainer.config.state_per_phone\n            )\n            print(\n                f\" | > Saving data parameters to: {trainer.config.mel_statistics_parameter_path}: value: {data_mean, data_std, init_transition_prob}\"\n            )\n            statistics = {\n                \"mean\": data_mean.item(),\n                \"std\": data_std.item(),\n                \"init_transition_prob\": init_transition_prob.item(),\n            }\n            torch.save(statistics, trainer.config.mel_statistics_parameter_path)\n\n        else:\n            print(\n                f\" | > Data parameters found for: {trainer.config.mel_statistics_parameter_path}. Loading mel normalization parameters...\"\n            )\n            statistics = torch.load(trainer.config.mel_statistics_parameter_path)\n            data_mean, data_std, init_transition_prob = (\n                statistics[\"mean\"],\n                statistics[\"std\"],\n                statistics[\"init_transition_prob\"],\n            )\n            print(f\" | > Data parameters loaded with value: {data_mean, data_std, init_transition_prob}\")\n\n        trainer.config.flat_start_params[\"transition_p\"] = (\n            init_transition_prob.item() if torch.is_tensor(init_transition_prob) else init_transition_prob\n        )\n        OverflowUtils.update_flat_start_transition(trainer.model, init_transition_prob)\n        trainer.model.update_mean_std(statistics)\n\n    @torch.inference_mode()\n    def _create_logs(self, batch, outputs, ap):  # pylint: disable=no-self-use, unused-argument\n        alignments, transition_vectors = outputs[\"alignments\"], outputs[\"transition_vectors\"]\n        means = torch.stack(outputs[\"means\"], dim=1)\n\n        figures = {\n            \"alignment\": 
plot_alignment(alignments[0].exp(), title=\"Forward alignment\", fig_size=(20, 20)),\n            \"log_alignment\": plot_alignment(\n                alignments[0].exp(), title=\"Forward log alignment\", plot_log=True, fig_size=(20, 20)\n            ),\n            \"transition_vectors\": plot_alignment(transition_vectors[0], title=\"Transition vectors\", fig_size=(20, 20)),\n            \"mel_from_most_probable_state\": plot_spectrogram(\n                get_spec_from_most_probable_state(alignments[0], means[0]), fig_size=(12, 3)\n            ),\n            \"mel_target\": plot_spectrogram(batch[\"mel_input\"][0], fig_size=(12, 3)),\n        }\n\n        # sample one item from the batch; index -1 gives the smallest item\n        print(\" | > Synthesising audio from the model...\")\n        inference_output = self.inference(\n            batch[\"text_input\"][-1].unsqueeze(0), aux_input={\"x_lengths\": batch[\"text_lengths\"][-1].unsqueeze(0)}\n        )\n        figures[\"synthesised\"] = plot_spectrogram(inference_output[\"model_outputs\"][0], fig_size=(12, 3))\n\n        states = [p[1] for p in inference_output[\"input_parameters\"][0]]\n        transition_probability_synthesising = [p[2].cpu().numpy() for p in inference_output[\"output_parameters\"][0]]\n\n        for i in range((len(transition_probability_synthesising) // 200) + 1):\n            start = i * 200\n            end = (i + 1) * 200\n            figures[f\"synthesised_transition_probabilities/{i}\"] = plot_transition_probabilities_to_numpy(\n                states[start:end], transition_probability_synthesising[start:end]\n            )\n\n        audio = ap.inv_melspectrogram(inference_output[\"model_outputs\"][0].T.cpu().numpy())\n        return figures, {\"audios\": audio}\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ):  # pylint: disable=unused-argument\n        \"\"\"Log training progress.\"\"\"\n        figures, audios = 
self._create_logs(batch, outputs, self.ap)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    def eval_log(\n        self, batch: Dict, outputs: Dict, logger: \"Logger\", assets: Dict, steps: int\n    ):  # pylint: disable=unused-argument\n        \"\"\"Compute and log evaluation metrics.\"\"\"\n        # Plot model parameter histograms\n        if isinstance(logger, TensorboardLogger):\n            # it is unclear whether any other logger supports this\n            for tag, value in self.named_parameters():\n                tag = tag.replace(\".\", \"/\")\n                logger.writer.add_histogram(tag, value.data.cpu().numpy(), steps)\n\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    def test_log(\n        self, outputs: dict, logger: \"Logger\", assets: dict, steps: int  # pylint: disable=unused-argument\n    ) -> None:\n        logger.test_audios(steps, outputs[1], self.ap.sample_rate)\n        logger.test_figures(steps, outputs[0])\n\n\nclass NLLLoss(nn.Module):\n    \"\"\"Negative log likelihood loss.\"\"\"\n\n    def forward(self, log_prob: torch.Tensor) -> dict:  # pylint: disable=no-self-use\n        \"\"\"Compute the loss.\n\n        Args:\n            log_prob (Tensor): Log likelihood values; they are averaged over all elements.\n\n        Returns:\n            dict: Dictionary whose `loss` key holds the mean negative log likelihood.\n\n        \"\"\"\n        return_dict = {}\n        return_dict[\"loss\"] = -log_prob.mean()\n        return return_dict\n"
  },
  {
    "path": "TTS/tts/models/overflow.py",
    "content": "import os\nfrom typing import Dict, List, Union\n\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom trainer.logging.tensorboard_logger import TensorboardLogger\n\nfrom TTS.tts.layers.overflow.common_layers import Encoder, OverflowUtils\nfrom TTS.tts.layers.overflow.decoder import Decoder\nfrom TTS.tts.layers.overflow.neural_hmm import NeuralHMM\nfrom TTS.tts.layers.overflow.plotting_utils import (\n    get_spec_from_most_probable_state,\n    plot_transition_probabilities_to_numpy,\n)\nfrom TTS.tts.models.base_tts import BaseTTS\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\nfrom TTS.utils.generic_utils import format_aux_input\nfrom TTS.utils.io import load_fsspec\n\n\nclass Overflow(BaseTTS):\n    \"\"\"OverFlow TTS model.\n\n    Paper::\n        https://arxiv.org/abs/2211.06892\n\n    Paper abstract::\n        Neural HMMs are a type of neural transducer recently proposed for\n    sequence-to-sequence modelling in text-to-speech. They combine the best features\n    of classic statistical speech synthesis and modern neural TTS, requiring less\n    data and fewer training updates, and are less prone to gibberish output caused\n    by neural attention failures. In this paper, we combine neural HMM TTS with\n    normalising flows for describing the highly non-Gaussian distribution of speech\n    acoustics. The result is a powerful, fully probabilistic model of durations and\n    acoustics that can be trained using exact maximum likelihood. 
Compared to\n    dominant flow-based acoustic models, our approach integrates autoregression for\n    improved modelling of long-range dependences such as utterance-level prosody.\n    Experiments show that a system based on our proposal gives more accurate\n    pronunciations and better subjective speech quality than comparable methods,\n    whilst retaining the original advantages of neural HMMs. Audio examples and code\n    are available at https://shivammehta25.github.io/OverFlow/.\n\n    Note:\n        - Neural HMM uses flat-start initialization, i.e. it computes the mean, std and transition probability\n        of the dataset and uses them to initialize the model. This benefits the model and helps it learn faster.\n        If you change the dataset or want to regenerate the parameters, change `force_generate_statistics` and\n        `mel_statistics_parameter_path` accordingly.\n\n        - To enable multi-GPU training, set `use_grad_checkpointing=False` in the config.\n        This will significantly increase memory usage.  
This is because to compute\n        the actual data likelihood (not an approximation using MAS/Viterbi) we must use\n        all the states at the previous time step during the forward pass to decide the\n        probability distribution at the current step i.e the difference between the forward\n        algorithm and viterbi approximation.\n\n    Check :class:`TTS.tts.configs.overflow.OverFlowConfig` for class arguments.\n    \"\"\"\n\n    def __init__(\n        self,\n        config: \"OverFlowConfig\",\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n\n        # pass all config fields to `self`\n        # for fewer code change\n        self.config = config\n        for key in config:\n            setattr(self, key, config[key])\n\n        self.decoder_output_dim = config.out_channels\n\n        self.encoder = Encoder(config.num_chars, config.state_per_phone, config.encoder_in_out_features)\n        self.neural_hmm = NeuralHMM(\n            frame_channels=self.out_channels,\n            ar_order=self.ar_order,\n            deterministic_transition=self.deterministic_transition,\n            encoder_dim=self.encoder_in_out_features,\n            prenet_type=self.prenet_type,\n            prenet_dim=self.prenet_dim,\n            prenet_n_layers=self.prenet_n_layers,\n            prenet_dropout=self.prenet_dropout,\n            prenet_dropout_at_inference=self.prenet_dropout_at_inference,\n            memory_rnn_dim=self.memory_rnn_dim,\n            outputnet_size=self.outputnet_size,\n            flat_start_params=self.flat_start_params,\n            std_floor=self.std_floor,\n            use_grad_checkpointing=self.use_grad_checkpointing,\n        )\n\n        self.decoder = Decoder(\n            self.out_channels,\n            self.hidden_channels_dec,\n            self.kernel_size_dec,\n            
self.dilation_rate,\n            self.num_flow_blocks_dec,\n            self.num_block_layers,\n            dropout_p=self.dropout_p_dec,\n            num_splits=self.num_splits,\n            num_squeeze=self.num_squeeze,\n            sigmoid_scale=self.sigmoid_scale,\n            c_in_channels=self.c_in_channels,\n        )\n\n        self.register_buffer(\"mean\", torch.tensor(0))\n        self.register_buffer(\"std\", torch.tensor(1))\n\n    def update_mean_std(self, statistics_dict: Dict):\n        self.mean.data = torch.tensor(statistics_dict[\"mean\"])\n        self.std.data = torch.tensor(statistics_dict[\"std\"])\n\n    def preprocess_batch(self, text, text_len, mels, mel_len):\n        if self.mean.item() == 0 or self.std.item() == 1:\n            statistics_dict = torch.load(self.mel_statistics_parameter_path)\n            self.update_mean_std(statistics_dict)\n\n        mels = self.normalize(mels)\n        return text, text_len, mels, mel_len\n\n    def normalize(self, x):\n        return x.sub(self.mean).div(self.std)\n\n    def inverse_normalize(self, x):\n        return x.mul(self.std).add(self.mean)\n\n    def forward(self, text, text_len, mels, mel_len):\n        \"\"\"\n        Forward pass for training and computing the log likelihood of a given batch.\n\n        Shapes:\n            Shapes:\n            text: :math:`[B, T_in]`\n            text_len: :math:`[B]`\n            mels: :math:`[B, T_out, C]`\n            mel_len: :math:`[B]`\n        \"\"\"\n        text, text_len, mels, mel_len = self.preprocess_batch(text, text_len, mels, mel_len)\n        encoder_outputs, encoder_output_len = self.encoder(text, text_len)\n        z, z_lengths, logdet = self.decoder(mels.transpose(1, 2), mel_len)\n        log_probs, fwd_alignments, transition_vectors, means = self.neural_hmm(\n            encoder_outputs, encoder_output_len, z, z_lengths\n        )\n\n        outputs = {\n            \"log_probs\": log_probs + logdet,\n            \"alignments\": 
fwd_alignments,\n            \"transition_vectors\": transition_vectors,\n            \"means\": means,\n        }\n\n        return outputs\n\n    @staticmethod\n    def _training_stats(batch):\n        stats = {}\n        stats[\"avg_text_length\"] = batch[\"text_lengths\"].float().mean()\n        stats[\"avg_spec_length\"] = batch[\"mel_lengths\"].float().mean()\n        stats[\"avg_text_batch_occupancy\"] = (batch[\"text_lengths\"].float() / batch[\"text_lengths\"].float().max()).mean()\n        stats[\"avg_spec_batch_occupancy\"] = (batch[\"mel_lengths\"].float() / batch[\"mel_lengths\"].float().max()).mean()\n        return stats\n\n    def train_step(self, batch: dict, criterion: nn.Module):\n        text_input = batch[\"text_input\"]\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        mel_lengths = batch[\"mel_lengths\"]\n\n        outputs = self.forward(\n            text=text_input,\n            text_len=text_lengths,\n            mels=mel_input,\n            mel_len=mel_lengths,\n        )\n        loss_dict = criterion(outputs[\"log_probs\"] / (mel_lengths.sum() + text_lengths.sum()))\n\n        # for printing useful statistics on terminal\n        loss_dict.update(self._training_stats(batch))\n        return outputs, loss_dict\n\n    def eval_step(self, batch: Dict, criterion: nn.Module):\n        return self.train_step(batch, criterion)\n\n    def _format_aux_input(self, aux_input: Dict, default_input_dict):\n        \"\"\"Set missing fields to their default value.\n\n        Args:\n            aux_inputs (Dict): Dictionary containing the auxiliary inputs.\n        \"\"\"\n        default_input_dict = default_input_dict.copy()\n        default_input_dict.update(\n            {\n                \"sampling_temp\": self.sampling_temp,\n                \"max_sampling_time\": self.max_sampling_time,\n                \"duration_threshold\": self.duration_threshold,\n            }\n        )\n        if 
aux_input:\n            return format_aux_input(default_input_dict, aux_input)\n        return default_input_dict\n\n    @torch.no_grad()\n    def inference(\n        self,\n        text: torch.Tensor,\n        aux_input={\"x_lengths\": None, \"sampling_temp\": None, \"max_sampling_time\": None, \"duration_threshold\": None},\n    ):  # pylint: disable=dangerous-default-value\n        \"\"\"Sampling from the model\n\n        Args:\n            text (torch.Tensor): :math:`[B, T_in]`\n            aux_inputs (_type_, optional): _description_. Defaults to None.\n\n        Returns:\n            outputs: Dictionary containing the following\n                - mel (torch.Tensor): :math:`[B, T_out, C]`\n                - hmm_outputs_len (torch.Tensor): :math:`[B]`\n                - state_travelled (List[List[int]]): List of lists containing the state travelled for each sample in the batch.\n                - input_parameters (list[torch.FloatTensor]): Input parameters to the neural HMM.\n                - output_parameters (list[torch.FloatTensor]): Output parameters to the neural HMM.\n        \"\"\"\n        default_input_dict = {\n            \"x_lengths\": torch.sum(text != 0, dim=1),\n        }\n        aux_input = self._format_aux_input(aux_input, default_input_dict)\n        encoder_outputs, encoder_output_len = self.encoder.inference(text, aux_input[\"x_lengths\"])\n        outputs = self.neural_hmm.inference(\n            encoder_outputs,\n            encoder_output_len,\n            sampling_temp=aux_input[\"sampling_temp\"],\n            max_sampling_time=aux_input[\"max_sampling_time\"],\n            duration_threshold=aux_input[\"duration_threshold\"],\n        )\n\n        mels, mel_outputs_len, _ = self.decoder(\n            outputs[\"hmm_outputs\"].transpose(1, 2), outputs[\"hmm_outputs_len\"], reverse=True\n        )\n        mels = self.inverse_normalize(mels.transpose(1, 2))\n        outputs.update({\"model_outputs\": mels, \"model_outputs_len\": 
mel_outputs_len})\n        outputs[\"alignments\"] = OverflowUtils.double_pad(outputs[\"alignments\"])\n        return outputs\n\n    @staticmethod\n    def get_criterion():\n        return NLLLoss()\n\n    @staticmethod\n    def init_from_config(config: \"OverFlowConfig\", samples: Union[List[List], List[Dict]] = None, verbose=True):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (OverFlowConfig): Model config.\n            samples (Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n            verbose (bool): If True, print init messages. Defaults to True.\n        \"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config, verbose)\n        tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config, samples)\n        return Overflow(new_config, ap, tokenizer, speaker_manager)\n\n    def load_checkpoint(\n        self, config: Coqpit, checkpoint_path: str, eval: bool = False, strict: bool = True, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"))\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            self.decoder.store_inverse()\n            assert not self.training\n\n    def on_init_start(self, trainer):\n        \"\"\"Compute the normalisation statistics and initial transition probability if they do not exist yet; otherwise load them.\"\"\"\n        if not os.path.isfile(trainer.config.mel_statistics_parameter_path) or trainer.config.force_generate_statistics:\n            dataloader = trainer.get_train_dataloader(\n                training_assets=None, samples=trainer.train_samples, verbose=False\n            )\n            print(\n                f\" | > Data parameters 
not found for: {trainer.config.mel_statistics_parameter_path}. Computing mel normalization parameters...\"\n            )\n            data_mean, data_std, init_transition_prob = OverflowUtils.get_data_parameters_for_flat_start(\n                dataloader, trainer.config.out_channels, trainer.config.state_per_phone\n            )\n            print(\n                f\" | > Saving data parameters to: {trainer.config.mel_statistics_parameter_path}: value: {data_mean, data_std, init_transition_prob}\"\n            )\n            statistics = {\n                \"mean\": data_mean.item(),\n                \"std\": data_std.item(),\n                \"init_transition_prob\": init_transition_prob.item(),\n            }\n            torch.save(statistics, trainer.config.mel_statistics_parameter_path)\n\n        else:\n            print(\n                f\" | > Data parameters found for: {trainer.config.mel_statistics_parameter_path}. Loading mel normalization parameters...\"\n            )\n            statistics = torch.load(trainer.config.mel_statistics_parameter_path)\n            data_mean, data_std, init_transition_prob = (\n                statistics[\"mean\"],\n                statistics[\"std\"],\n                statistics[\"init_transition_prob\"],\n            )\n            print(f\" | > Data parameters loaded with value: {data_mean, data_std, init_transition_prob}\")\n\n        trainer.config.flat_start_params[\"transition_p\"] = (\n            init_transition_prob.item() if torch.is_tensor(init_transition_prob) else init_transition_prob\n        )\n        OverflowUtils.update_flat_start_transition(trainer.model, init_transition_prob)\n        trainer.model.update_mean_std(statistics)\n\n    @torch.inference_mode()\n    def _create_logs(self, batch, outputs, ap):  # pylint: disable=no-self-use, unused-argument\n        alignments, transition_vectors = outputs[\"alignments\"], outputs[\"transition_vectors\"]\n        means = torch.stack(outputs[\"means\"], 
dim=1)\n\n        figures = {\n            \"alignment\": plot_alignment(alignments[0].exp(), title=\"Forward alignment\", fig_size=(20, 20)),\n            \"log_alignment\": plot_alignment(\n                alignments[0].exp(), title=\"Forward log alignment\", plot_log=True, fig_size=(20, 20)\n            ),\n            \"transition_vectors\": plot_alignment(transition_vectors[0], title=\"Transition vectors\", fig_size=(20, 20)),\n            \"mel_from_most_probable_state\": plot_spectrogram(\n                get_spec_from_most_probable_state(alignments[0], means[0], self.decoder), fig_size=(12, 3)\n            ),\n            \"mel_target\": plot_spectrogram(batch[\"mel_input\"][0], fig_size=(12, 3)),\n        }\n\n        # sample one item from the batch; index -1 gives the smallest item\n        print(\" | > Synthesising audio from the model...\")\n        inference_output = self.inference(\n            batch[\"text_input\"][-1].unsqueeze(0), aux_input={\"x_lengths\": batch[\"text_lengths\"][-1].unsqueeze(0)}\n        )\n        figures[\"synthesised\"] = plot_spectrogram(inference_output[\"model_outputs\"][0], fig_size=(12, 3))\n\n        states = [p[1] for p in inference_output[\"input_parameters\"][0]]\n        transition_probability_synthesising = [p[2].cpu().numpy() for p in inference_output[\"output_parameters\"][0]]\n\n        for i in range((len(transition_probability_synthesising) // 200) + 1):\n            start = i * 200\n            end = (i + 1) * 200\n            figures[f\"synthesised_transition_probabilities/{i}\"] = plot_transition_probabilities_to_numpy(\n                states[start:end], transition_probability_synthesising[start:end]\n            )\n\n        audio = ap.inv_melspectrogram(inference_output[\"model_outputs\"][0].T.cpu().numpy())\n        return figures, {\"audios\": audio}\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ):  # pylint: disable=unused-argument\n     
   \"\"\"Log training progress.\"\"\"\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    def eval_log(\n        self, batch: Dict, outputs: Dict, logger: \"Logger\", assets: Dict, steps: int\n    ):  # pylint: disable=unused-argument\n        \"\"\"Compute and log evaluation metrics.\"\"\"\n        # Plot model parameters histograms\n        if isinstance(logger, TensorboardLogger):\n            # I don't know if any other loggers supports this\n            for tag, value in self.named_parameters():\n                tag = tag.replace(\".\", \"/\")\n                logger.writer.add_histogram(tag, value.data.cpu().numpy(), steps)\n\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    def test_log(\n        self, outputs: dict, logger: \"Logger\", assets: dict, steps: int  # pylint: disable=unused-argument\n    ) -> None:\n        logger.test_audios(steps, outputs[1], self.ap.sample_rate)\n        logger.test_figures(steps, outputs[0])\n\n\nclass NLLLoss(nn.Module):\n    \"\"\"Negative log likelihood loss.\"\"\"\n\n    def forward(self, log_prob: torch.Tensor) -> dict:  # pylint: disable=no-self-use\n        \"\"\"Compute the loss.\n\n        Args:\n            logits (Tensor): [B, T, D]\n\n        Returns:\n            Tensor: [1]\n\n        \"\"\"\n        return_dict = {}\n        return_dict[\"loss\"] = -log_prob.mean()\n        return return_dict\n"
  },
  {
    "path": "TTS/tts/models/tacotron.py",
    "content": "# coding: utf-8\n\nfrom typing import Dict, List, Tuple, Union\n\nimport torch\nfrom torch import nn\nfrom torch.cuda.amp.autocast_mode import autocast\nfrom trainer.trainer_utils import get_optimizer, get_scheduler\n\nfrom TTS.tts.layers.tacotron.capacitron_layers import CapacitronVAE\nfrom TTS.tts.layers.tacotron.gst_layers import GST\nfrom TTS.tts.layers.tacotron.tacotron import Decoder, Encoder, PostCBHG\nfrom TTS.tts.models.base_tacotron import BaseTacotron\nfrom TTS.tts.utils.measures import alignment_diagonal_score\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\nfrom TTS.utils.capacitron_optimizer import CapacitronOptimizer\n\n\nclass Tacotron(BaseTacotron):\n    \"\"\"Tacotron as in https://arxiv.org/abs/1703.10135\n    It's an autoregressive encoder-attention-decoder-postnet architecture.\n    Check `TacotronConfig` for the arguments.\n\n    Args:\n        config (TacotronConfig): Configuration for the Tacotron model.\n        speaker_manager (SpeakerManager): Speaker manager to handle multi-speaker settings. Only use if the model is\n            a multi-speaker model. 
Defaults to None.\n    \"\"\"\n\n    def __init__(\n        self,\n        config: \"TacotronConfig\",\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n\n        # pass all config fields to `self`\n        # to minimize code changes\n        for key in config:\n            setattr(self, key, config[key])\n\n        # set speaker embedding channel size for determining `in_channels` for the connected layers.\n        # `init_multispeaker` needs to be called once more in training to initialize the speaker embedding layer based\n        # on the number of speakers inferred from the dataset.\n        if self.use_speaker_embedding or self.use_d_vector_file:\n            self.init_multispeaker(config)\n            self.decoder_in_features += self.embedded_speaker_dim  # add speaker embedding dim\n\n        if self.use_gst:\n            self.decoder_in_features += self.gst.gst_embedding_dim\n\n        if self.use_capacitron_vae:\n            self.decoder_in_features += self.capacitron_vae.capacitron_VAE_embedding_dim\n\n        # embedding layer\n        self.embedding = nn.Embedding(self.num_chars, 256, padding_idx=0)\n        self.embedding.weight.data.normal_(0, 0.3)\n\n        # base model layers\n        self.encoder = Encoder(self.encoder_in_features)\n        self.decoder = Decoder(\n            self.decoder_in_features,\n            self.decoder_output_dim,\n            self.r,\n            self.memory_size,\n            self.attention_type,\n            self.windowing,\n            self.attention_norm,\n            self.prenet_type,\n            self.prenet_dropout,\n            self.use_forward_attn,\n            self.transition_agent,\n            self.forward_attn_mask,\n            self.location_attn,\n            self.attention_heads,\n            self.separate_stopnet,\n            
self.max_decoder_steps,\n        )\n        self.postnet = PostCBHG(self.decoder_output_dim)\n        self.last_linear = nn.Linear(self.postnet.cbhg.gru_features * 2, self.out_channels)\n\n        # setup prenet dropout\n        self.decoder.prenet.dropout_at_inference = self.prenet_dropout_at_inference\n\n        # global style token layers\n        if self.gst and self.use_gst:\n            self.gst_layer = GST(\n                num_mel=self.decoder_output_dim,\n                num_heads=self.gst.gst_num_heads,\n                num_style_tokens=self.gst.gst_num_style_tokens,\n                gst_embedding_dim=self.gst.gst_embedding_dim,\n            )\n\n        # Capacitron layers\n        if self.capacitron_vae and self.use_capacitron_vae:\n            self.capacitron_vae_layer = CapacitronVAE(\n                num_mel=self.decoder_output_dim,\n                encoder_output_dim=self.encoder_in_features,\n                capacitron_VAE_embedding_dim=self.capacitron_vae.capacitron_VAE_embedding_dim,\n                speaker_embedding_dim=self.embedded_speaker_dim\n                if self.use_speaker_embedding and self.capacitron_vae.capacitron_use_speaker_embedding\n                else None,\n                text_summary_embedding_dim=self.capacitron_vae.capacitron_text_summary_embedding_dim\n                if self.capacitron_vae.capacitron_use_text_summary_embeddings\n                else None,\n            )\n\n        # backward pass decoder\n        if self.bidirectional_decoder:\n            self._init_backward_decoder()\n        # setup DDC\n        if self.double_decoder_consistency:\n            self.coarse_decoder = Decoder(\n                self.decoder_in_features,\n                self.decoder_output_dim,\n                self.ddc_r,\n                self.memory_size,\n                self.attention_type,\n                self.windowing,\n                self.attention_norm,\n                self.prenet_type,\n                self.prenet_dropout,\n 
               self.use_forward_attn,\n                self.transition_agent,\n                self.forward_attn_mask,\n                self.location_attn,\n                self.attention_heads,\n                self.separate_stopnet,\n                self.max_decoder_steps,\n            )\n\n    def forward(  # pylint: disable=dangerous-default-value\n        self, text, text_lengths, mel_specs=None, mel_lengths=None, aux_input={\"speaker_ids\": None, \"d_vectors\": None}\n    ):\n        \"\"\"\n        Shapes:\n            text: [B, T_in]\n            text_lengths: [B]\n            mel_specs: [B, T_out, C]\n            mel_lengths: [B]\n            aux_input: 'speaker_ids': [B, 1] and  'd_vectors':[B, C]\n        \"\"\"\n        aux_input = self._format_aux_input(aux_input)\n        outputs = {\"alignments_backward\": None, \"decoder_outputs_backward\": None}\n        inputs = self.embedding(text)\n        input_mask, output_mask = self.compute_masks(text_lengths, mel_lengths)\n        # B x T_in x encoder_in_features\n        encoder_outputs = self.encoder(inputs)\n        # sequence masking\n        encoder_outputs = encoder_outputs * input_mask.unsqueeze(2).expand_as(encoder_outputs)\n        # global style token\n        if self.gst and self.use_gst:\n            # B x gst_dim\n            encoder_outputs = self.compute_gst(encoder_outputs, mel_specs)\n        # speaker embedding\n        if self.use_speaker_embedding or self.use_d_vector_file:\n            if not self.use_d_vector_file:\n                # B x 1 x speaker_embed_dim\n                embedded_speakers = self.speaker_embedding(aux_input[\"speaker_ids\"])[:, None]\n            else:\n                # B x 1 x speaker_embed_dim\n                embedded_speakers = torch.unsqueeze(aux_input[\"d_vectors\"], 1)\n            encoder_outputs = self._concat_speaker_embedding(encoder_outputs, embedded_speakers)\n        # Capacitron\n        if self.capacitron_vae and self.use_capacitron_vae:\n          
  # B x capacitron_VAE_embedding_dim\n            encoder_outputs, *capacitron_vae_outputs = self.compute_capacitron_VAE_embedding(\n                encoder_outputs,\n                reference_mel_info=[mel_specs, mel_lengths],\n                text_info=[inputs, text_lengths]\n                if self.capacitron_vae.capacitron_use_text_summary_embeddings\n                else None,\n                speaker_embedding=embedded_speakers if self.capacitron_vae.capacitron_use_speaker_embedding else None,\n            )\n        else:\n            capacitron_vae_outputs = None\n        # decoder_outputs: B x decoder_in_features x T_out\n        # alignments: B x T_in x encoder_in_features\n        # stop_tokens: B x T_in\n        decoder_outputs, alignments, stop_tokens = self.decoder(encoder_outputs, mel_specs, input_mask)\n        # sequence masking\n        if output_mask is not None:\n            decoder_outputs = decoder_outputs * output_mask.unsqueeze(1).expand_as(decoder_outputs)\n        # B x T_out x decoder_in_features\n        postnet_outputs = self.postnet(decoder_outputs)\n        # sequence masking\n        if output_mask is not None:\n            postnet_outputs = postnet_outputs * output_mask.unsqueeze(2).expand_as(postnet_outputs)\n        # B x T_out x posnet_dim\n        postnet_outputs = self.last_linear(postnet_outputs)\n        # B x T_out x decoder_in_features\n        decoder_outputs = decoder_outputs.transpose(1, 2).contiguous()\n        if self.bidirectional_decoder:\n            decoder_outputs_backward, alignments_backward = self._backward_pass(mel_specs, encoder_outputs, input_mask)\n            outputs[\"alignments_backward\"] = alignments_backward\n            outputs[\"decoder_outputs_backward\"] = decoder_outputs_backward\n        if self.double_decoder_consistency:\n            decoder_outputs_backward, alignments_backward = self._coarse_decoder_pass(\n                mel_specs, encoder_outputs, alignments, input_mask\n            )\n    
        outputs[\"alignments_backward\"] = alignments_backward\n            outputs[\"decoder_outputs_backward\"] = decoder_outputs_backward\n        outputs.update(\n            {\n                \"model_outputs\": postnet_outputs,\n                \"decoder_outputs\": decoder_outputs,\n                \"alignments\": alignments,\n                \"stop_tokens\": stop_tokens,\n                \"capacitron_vae_outputs\": capacitron_vae_outputs,\n            }\n        )\n        return outputs\n\n    @torch.no_grad()\n    def inference(self, text_input, aux_input=None):\n        aux_input = self._format_aux_input(aux_input)\n        inputs = self.embedding(text_input)\n        encoder_outputs = self.encoder(inputs)\n        if self.gst and self.use_gst:\n            # B x gst_dim\n            encoder_outputs = self.compute_gst(encoder_outputs, aux_input[\"style_mel\"], aux_input[\"d_vectors\"])\n        if self.capacitron_vae and self.use_capacitron_vae:\n            if aux_input[\"style_text\"] is not None:\n                style_text_embedding = self.embedding(aux_input[\"style_text\"])\n                style_text_length = torch.tensor([style_text_embedding.size(1)], dtype=torch.int64).to(\n                    encoder_outputs.device\n                )  # pylint: disable=not-callable\n            reference_mel_length = (\n                torch.tensor([aux_input[\"style_mel\"].size(1)], dtype=torch.int64).to(encoder_outputs.device)\n                if aux_input[\"style_mel\"] is not None\n                else None\n            )  # pylint: disable=not-callable\n            # B x capacitron_VAE_embedding_dim\n            encoder_outputs, *_ = self.compute_capacitron_VAE_embedding(\n                encoder_outputs,\n                reference_mel_info=[aux_input[\"style_mel\"], reference_mel_length]\n                if aux_input[\"style_mel\"] is not None\n                else None,\n                text_info=[style_text_embedding, style_text_length] if 
aux_input[\"style_text\"] is not None else None,\n                speaker_embedding=aux_input[\"d_vectors\"]\n                if self.capacitron_vae.capacitron_use_speaker_embedding\n                else None,\n            )\n        if self.num_speakers > 1:\n            if not self.use_d_vector_file:\n                # B x 1 x speaker_embed_dim\n                embedded_speakers = self.speaker_embedding(aux_input[\"speaker_ids\"])\n                # reshape embedded_speakers\n                if embedded_speakers.ndim == 1:\n                    embedded_speakers = embedded_speakers[None, None, :]\n                elif embedded_speakers.ndim == 2:\n                    embedded_speakers = embedded_speakers[None, :]\n            else:\n                # B x 1 x speaker_embed_dim\n                embedded_speakers = torch.unsqueeze(aux_input[\"d_vectors\"], 1)\n            encoder_outputs = self._concat_speaker_embedding(encoder_outputs, embedded_speakers)\n        decoder_outputs, alignments, stop_tokens = self.decoder.inference(encoder_outputs)\n        postnet_outputs = self.postnet(decoder_outputs)\n        postnet_outputs = self.last_linear(postnet_outputs)\n        decoder_outputs = decoder_outputs.transpose(1, 2)\n        outputs = {\n            \"model_outputs\": postnet_outputs,\n            \"decoder_outputs\": decoder_outputs,\n            \"alignments\": alignments,\n            \"stop_tokens\": stop_tokens,\n        }\n        return outputs\n\n    def before_backward_pass(self, loss_dict, optimizer) -> None:\n        # Extracting custom training specific operations for capacitron\n        # from the trainer\n        if self.use_capacitron_vae:\n            loss_dict[\"capacitron_vae_beta_loss\"].backward()\n            optimizer.first_step()\n\n    def train_step(self, batch: Dict, criterion: torch.nn.Module) -> Tuple[Dict, Dict]:\n        \"\"\"Perform a single training step by fetching the right set of samples from the batch.\n\n        Args:\n        
    batch ([Dict]): A dictionary of input tensors.\n            criterion ([torch.nn.Module]): Callable criterion to compute model loss.\n        \"\"\"\n        text_input = batch[\"text_input\"]\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        mel_lengths = batch[\"mel_lengths\"]\n        linear_input = batch[\"linear_input\"]\n        stop_targets = batch[\"stop_targets\"]\n        stop_target_lengths = batch[\"stop_target_lengths\"]\n        speaker_ids = batch[\"speaker_ids\"]\n        d_vectors = batch[\"d_vectors\"]\n\n        aux_input = {\"speaker_ids\": speaker_ids, \"d_vectors\": d_vectors}\n        outputs = self.forward(text_input, text_lengths, mel_input, mel_lengths, aux_input)\n\n        # set the [alignment] lengths wrt reduction factor for guided attention\n        if mel_lengths.max() % self.decoder.r != 0:\n            alignment_lengths = (\n                mel_lengths + (self.decoder.r - (mel_lengths.max() % self.decoder.r))\n            ) // self.decoder.r\n        else:\n            alignment_lengths = mel_lengths // self.decoder.r\n\n        # compute loss\n        with autocast(enabled=False):  # use float32 for the criterion\n            loss_dict = criterion(\n                outputs[\"model_outputs\"].float(),\n                outputs[\"decoder_outputs\"].float(),\n                mel_input.float(),\n                linear_input.float(),\n                outputs[\"stop_tokens\"].float(),\n                stop_targets.float(),\n                stop_target_lengths,\n                outputs[\"capacitron_vae_outputs\"] if self.capacitron_vae else None,\n                mel_lengths,\n                None if outputs[\"decoder_outputs_backward\"] is None else outputs[\"decoder_outputs_backward\"].float(),\n                outputs[\"alignments\"].float(),\n                alignment_lengths,\n                None if outputs[\"alignments_backward\"] is None else 
outputs[\"alignments_backward\"].float(),\n                text_lengths,\n            )\n\n        # compute alignment error (the lower the better )\n        align_error = 1 - alignment_diagonal_score(outputs[\"alignments\"])\n        loss_dict[\"align_error\"] = align_error\n        return outputs, loss_dict\n\n    def get_optimizer(self) -> List:\n        if self.use_capacitron_vae:\n            return CapacitronOptimizer(self.config, self.named_parameters())\n        return get_optimizer(self.config.optimizer, self.config.optimizer_params, self.config.lr, self)\n\n    def get_scheduler(self, optimizer: object):\n        opt = optimizer.primary_optimizer if self.use_capacitron_vae else optimizer\n        return get_scheduler(self.config.lr_scheduler, self.config.lr_scheduler_params, opt)\n\n    def before_gradient_clipping(self):\n        if self.use_capacitron_vae:\n            # Capacitron model specific gradient clipping\n            model_params_to_clip = []\n            for name, param in self.named_parameters():\n                if param.requires_grad:\n                    if name != \"capacitron_vae_layer.beta\":\n                        model_params_to_clip.append(param)\n            torch.nn.utils.clip_grad_norm_(model_params_to_clip, self.capacitron_vae.capacitron_grad_clip)\n\n    def _create_logs(self, batch, outputs, ap):\n        postnet_outputs = outputs[\"model_outputs\"]\n        decoder_outputs = outputs[\"decoder_outputs\"]\n        alignments = outputs[\"alignments\"]\n        alignments_backward = outputs[\"alignments_backward\"]\n        mel_input = batch[\"mel_input\"]\n        linear_input = batch[\"linear_input\"]\n\n        pred_linear_spec = postnet_outputs[0].data.cpu().numpy()\n        pred_mel_spec = decoder_outputs[0].data.cpu().numpy()\n        gt_linear_spec = linear_input[0].data.cpu().numpy()\n        gt_mel_spec = mel_input[0].data.cpu().numpy()\n        align_img = alignments[0].data.cpu().numpy()\n\n        figures = {\n      
      \"pred_linear_spec\": plot_spectrogram(pred_linear_spec, ap, output_fig=False),\n            \"real_linear_spec\": plot_spectrogram(gt_linear_spec, ap, output_fig=False),\n            \"pred_mel_spec\": plot_spectrogram(pred_mel_spec, ap, output_fig=False),\n            \"real_mel_spec\": plot_spectrogram(gt_mel_spec, ap, output_fig=False),\n            \"alignment\": plot_alignment(align_img, output_fig=False),\n        }\n\n        if self.bidirectional_decoder or self.double_decoder_consistency:\n            figures[\"alignment_backward\"] = plot_alignment(alignments_backward[0].data.cpu().numpy(), output_fig=False)\n\n        # Sample audio\n        audio = ap.inv_spectrogram(pred_linear_spec.T)\n        return figures, {\"audio\": audio}\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ) -> None:  # pylint: disable=no-self-use\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    def eval_step(self, batch: dict, criterion: nn.Module):\n        return self.train_step(batch, criterion)\n\n    def eval_log(self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int) -> None:\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    @staticmethod\n    def init_from_config(config: \"TacotronConfig\", samples: Union[List[List], List[Dict]] = None):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (TacotronConfig): Model config.\n            samples (Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n        \"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config)\n  
      tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config, samples)\n        return Tacotron(new_config, ap, tokenizer, speaker_manager)\n"
  },
  {
    "path": "TTS/tts/models/tacotron2.py",
    "content": "# coding: utf-8\n\nfrom typing import Dict, List, Union\n\nimport torch\nfrom torch import nn\nfrom torch.cuda.amp.autocast_mode import autocast\nfrom trainer.trainer_utils import get_optimizer, get_scheduler\n\nfrom TTS.tts.layers.tacotron.capacitron_layers import CapacitronVAE\nfrom TTS.tts.layers.tacotron.gst_layers import GST\nfrom TTS.tts.layers.tacotron.tacotron2 import Decoder, Encoder, Postnet\nfrom TTS.tts.models.base_tacotron import BaseTacotron\nfrom TTS.tts.utils.measures import alignment_diagonal_score\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\nfrom TTS.utils.capacitron_optimizer import CapacitronOptimizer\n\n\nclass Tacotron2(BaseTacotron):\n    \"\"\"Tacotron2 model implementation inherited from :class:`TTS.tts.models.base_tacotron.BaseTacotron`.\n\n    Paper::\n        https://arxiv.org/abs/1712.05884\n\n    Paper abstract::\n        This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text.\n        The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character\n        embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize\n        timedomain waveforms from those spectrograms. Our model achieves a mean opinion score (MOS) of 4.53 comparable\n        to a MOS of 4.58 for professionally recorded speech. To validate our design choices, we present ablation\n        studies of key components of our system and evaluate the impact of using mel spectrograms as the input to\n        WaveNet instead of linguistic, duration, and F0 features. 
We further demonstrate that using a compact acoustic\n        intermediate representation enables significant simplification of the WaveNet architecture.\n\n    Check :class:`TTS.tts.configs.tacotron2_config.Tacotron2Config` for model arguments.\n\n    Args:\n        config (Tacotron2Config):\n            Configuration for the Tacotron2 model.\n        speaker_manager (SpeakerManager):\n            Speaker manager for multi-speaker training. Use only for multi-speaker training. Defaults to None.\n    \"\"\"\n\n    def __init__(\n        self,\n        config: \"Tacotron2Config\",\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager)\n\n        self.decoder_output_dim = config.out_channels\n\n        # pass all config fields to `self`\n        # to minimize code changes\n        for key in config:\n            setattr(self, key, config[key])\n\n        # init multi-speaker layers\n        if self.use_speaker_embedding or self.use_d_vector_file:\n            self.init_multispeaker(config)\n            self.decoder_in_features += self.embedded_speaker_dim  # add speaker embedding dim\n\n        if self.use_gst:\n            self.decoder_in_features += self.gst.gst_embedding_dim\n\n        if self.use_capacitron_vae:\n            self.decoder_in_features += self.capacitron_vae.capacitron_VAE_embedding_dim\n\n        # embedding layer\n        self.embedding = nn.Embedding(self.num_chars, 512, padding_idx=0)\n\n        # base model layers\n        self.encoder = Encoder(self.encoder_in_features)\n\n        self.decoder = Decoder(\n            self.decoder_in_features,\n            self.decoder_output_dim,\n            self.r,\n            self.attention_type,\n            self.attention_win,\n            self.attention_norm,\n            self.prenet_type,\n            self.prenet_dropout,\n            
self.use_forward_attn,\n            self.transition_agent,\n            self.forward_attn_mask,\n            self.location_attn,\n            self.attention_heads,\n            self.separate_stopnet,\n            self.max_decoder_steps,\n        )\n        self.postnet = Postnet(self.out_channels)\n\n        # setup prenet dropout\n        self.decoder.prenet.dropout_at_inference = self.prenet_dropout_at_inference\n\n        # global style token layers\n        if self.gst and self.use_gst:\n            self.gst_layer = GST(\n                num_mel=self.decoder_output_dim,\n                num_heads=self.gst.gst_num_heads,\n                num_style_tokens=self.gst.gst_num_style_tokens,\n                gst_embedding_dim=self.gst.gst_embedding_dim,\n            )\n\n        # Capacitron VAE Layers\n        if self.capacitron_vae and self.use_capacitron_vae:\n            self.capacitron_vae_layer = CapacitronVAE(\n                num_mel=self.decoder_output_dim,\n                encoder_output_dim=self.encoder_in_features,\n                capacitron_VAE_embedding_dim=self.capacitron_vae.capacitron_VAE_embedding_dim,\n                speaker_embedding_dim=self.embedded_speaker_dim\n                if self.capacitron_vae.capacitron_use_speaker_embedding\n                else None,\n                text_summary_embedding_dim=self.capacitron_vae.capacitron_text_summary_embedding_dim\n                if self.capacitron_vae.capacitron_use_text_summary_embeddings\n                else None,\n            )\n\n        # backward pass decoder\n        if self.bidirectional_decoder:\n            self._init_backward_decoder()\n        # setup DDC\n        if self.double_decoder_consistency:\n            self.coarse_decoder = Decoder(\n                self.decoder_in_features,\n                self.decoder_output_dim,\n                self.ddc_r,\n                self.attention_type,\n                self.attention_win,\n                self.attention_norm,\n                
self.prenet_type,\n                self.prenet_dropout,\n                self.use_forward_attn,\n                self.transition_agent,\n                self.forward_attn_mask,\n                self.location_attn,\n                self.attention_heads,\n                self.separate_stopnet,\n                self.max_decoder_steps,\n            )\n\n    @staticmethod\n    def shape_outputs(mel_outputs, mel_outputs_postnet, alignments):\n        \"\"\"Final reshape of the model output tensors.\"\"\"\n        mel_outputs = mel_outputs.transpose(1, 2)\n        mel_outputs_postnet = mel_outputs_postnet.transpose(1, 2)\n        return mel_outputs, mel_outputs_postnet, alignments\n\n    def forward(  # pylint: disable=dangerous-default-value\n        self, text, text_lengths, mel_specs=None, mel_lengths=None, aux_input={\"speaker_ids\": None, \"d_vectors\": None}\n    ):\n        \"\"\"Forward pass for training with Teacher Forcing.\n\n        Shapes:\n            text: :math:`[B, T_in]`\n            text_lengths: :math:`[B]`\n            mel_specs: :math:`[B, T_out, C]`\n            mel_lengths: :math:`[B]`\n            aux_input: 'speaker_ids': :math:`[B, 1]` and  'd_vectors': :math:`[B, C]`\n        \"\"\"\n        aux_input = self._format_aux_input(aux_input)\n        outputs = {\"alignments_backward\": None, \"decoder_outputs_backward\": None}\n        # compute mask for padding\n        # B x T_in_max (boolean)\n        input_mask, output_mask = self.compute_masks(text_lengths, mel_lengths)\n        # B x D_embed x T_in_max\n        embedded_inputs = self.embedding(text).transpose(1, 2)\n        # B x T_in_max x D_en\n        encoder_outputs = self.encoder(embedded_inputs, text_lengths)\n        if self.gst and self.use_gst:\n            # B x gst_dim\n            encoder_outputs = self.compute_gst(encoder_outputs, mel_specs)\n\n        if self.use_speaker_embedding or self.use_d_vector_file:\n            if not self.use_d_vector_file:\n                # B x 1 x 
speaker_embed_dim\n                embedded_speakers = self.speaker_embedding(aux_input[\"speaker_ids\"])[:, None]\n            else:\n                # B x 1 x speaker_embed_dim\n                embedded_speakers = torch.unsqueeze(aux_input[\"d_vectors\"], 1)\n            encoder_outputs = self._concat_speaker_embedding(encoder_outputs, embedded_speakers)\n\n        # capacitron\n        if self.capacitron_vae and self.use_capacitron_vae:\n            # B x capacitron_VAE_embedding_dim\n            encoder_outputs, *capacitron_vae_outputs = self.compute_capacitron_VAE_embedding(\n                encoder_outputs,\n                reference_mel_info=[mel_specs, mel_lengths],\n                text_info=[embedded_inputs.transpose(1, 2), text_lengths]\n                if self.capacitron_vae.capacitron_use_text_summary_embeddings\n                else None,\n                speaker_embedding=embedded_speakers if self.capacitron_vae.capacitron_use_speaker_embedding else None,\n            )\n        else:\n            capacitron_vae_outputs = None\n\n        encoder_outputs = encoder_outputs * input_mask.unsqueeze(2).expand_as(encoder_outputs)\n\n        # B x mel_dim x T_out -- B x T_out//r x T_in -- B x T_out//r\n        decoder_outputs, alignments, stop_tokens = self.decoder(encoder_outputs, mel_specs, input_mask)\n        # sequence masking\n        if mel_lengths is not None:\n            decoder_outputs = decoder_outputs * output_mask.unsqueeze(1).expand_as(decoder_outputs)\n        # B x mel_dim x T_out\n        postnet_outputs = self.postnet(decoder_outputs)\n        postnet_outputs = decoder_outputs + postnet_outputs\n        # sequence masking\n        if output_mask is not None:\n            postnet_outputs = postnet_outputs * output_mask.unsqueeze(1).expand_as(postnet_outputs)\n        # B x T_out x mel_dim -- B x T_out x mel_dim -- B x T_out//r x T_in\n        decoder_outputs, postnet_outputs, alignments = self.shape_outputs(decoder_outputs, postnet_outputs, 
alignments)\n        if self.bidirectional_decoder:\n            decoder_outputs_backward, alignments_backward = self._backward_pass(mel_specs, encoder_outputs, input_mask)\n            outputs[\"alignments_backward\"] = alignments_backward\n            outputs[\"decoder_outputs_backward\"] = decoder_outputs_backward\n        if self.double_decoder_consistency:\n            decoder_outputs_backward, alignments_backward = self._coarse_decoder_pass(\n                mel_specs, encoder_outputs, alignments, input_mask\n            )\n            outputs[\"alignments_backward\"] = alignments_backward\n            outputs[\"decoder_outputs_backward\"] = decoder_outputs_backward\n        outputs.update(\n            {\n                \"model_outputs\": postnet_outputs,\n                \"decoder_outputs\": decoder_outputs,\n                \"alignments\": alignments,\n                \"stop_tokens\": stop_tokens,\n                \"capacitron_vae_outputs\": capacitron_vae_outputs,\n            }\n        )\n        return outputs\n\n    @torch.no_grad()\n    def inference(self, text, aux_input=None):\n        \"\"\"Forward pass for inference with no Teacher-Forcing.\n\n        Shapes:\n           text: :math:`[B, T_in]`\n           text_lengths: :math:`[B]`\n        \"\"\"\n        aux_input = self._format_aux_input(aux_input)\n        embedded_inputs = self.embedding(text).transpose(1, 2)\n        encoder_outputs = self.encoder.inference(embedded_inputs)\n\n        if self.gst and self.use_gst:\n            # B x gst_dim\n            encoder_outputs = self.compute_gst(encoder_outputs, aux_input[\"style_mel\"], aux_input[\"d_vectors\"])\n\n        if self.capacitron_vae and self.use_capacitron_vae:\n            if aux_input[\"style_text\"] is not None:\n                style_text_embedding = self.embedding(aux_input[\"style_text\"])\n                style_text_length = torch.tensor([style_text_embedding.size(1)], dtype=torch.int64).to(\n                    
encoder_outputs.device\n                )  # pylint: disable=not-callable\n            reference_mel_length = (\n                torch.tensor([aux_input[\"style_mel\"].size(1)], dtype=torch.int64).to(encoder_outputs.device)\n                if aux_input[\"style_mel\"] is not None\n                else None\n            )  # pylint: disable=not-callable\n            # B x capacitron_VAE_embedding_dim\n            encoder_outputs, *_ = self.compute_capacitron_VAE_embedding(\n                encoder_outputs,\n                reference_mel_info=[aux_input[\"style_mel\"], reference_mel_length]\n                if aux_input[\"style_mel\"] is not None\n                else None,\n                text_info=[style_text_embedding, style_text_length] if aux_input[\"style_text\"] is not None else None,\n                speaker_embedding=aux_input[\"d_vectors\"]\n                if self.capacitron_vae.capacitron_use_speaker_embedding\n                else None,\n            )\n\n        if self.num_speakers > 1:\n            if not self.use_d_vector_file:\n                embedded_speakers = self.speaker_embedding(aux_input[\"speaker_ids\"])[None]\n                # reshape embedded_speakers\n                if embedded_speakers.ndim == 1:\n                    embedded_speakers = embedded_speakers[None, None, :]\n                elif embedded_speakers.ndim == 2:\n                    embedded_speakers = embedded_speakers[None, :]\n            else:\n                embedded_speakers = aux_input[\"d_vectors\"]\n\n            encoder_outputs = self._concat_speaker_embedding(encoder_outputs, embedded_speakers)\n\n        decoder_outputs, alignments, stop_tokens = self.decoder.inference(encoder_outputs)\n        postnet_outputs = self.postnet(decoder_outputs)\n        postnet_outputs = decoder_outputs + postnet_outputs\n        decoder_outputs, postnet_outputs, alignments = self.shape_outputs(decoder_outputs, postnet_outputs, alignments)\n        outputs = {\n            
\"model_outputs\": postnet_outputs,\n            \"decoder_outputs\": decoder_outputs,\n            \"alignments\": alignments,\n            \"stop_tokens\": stop_tokens,\n        }\n        return outputs\n\n    def before_backward_pass(self, loss_dict, optimizer) -> None:\n        # Extracting custom training specific operations for capacitron\n        # from the trainer\n        if self.use_capacitron_vae:\n            loss_dict[\"capacitron_vae_beta_loss\"].backward()\n            optimizer.first_step()\n\n    def train_step(self, batch: Dict, criterion: torch.nn.Module):\n        \"\"\"A single training step. Forward pass and loss computation.\n\n        Args:\n            batch ([Dict]): A dictionary of input tensors.\n            criterion ([type]): Callable criterion to compute model loss.\n        \"\"\"\n        text_input = batch[\"text_input\"]\n        text_lengths = batch[\"text_lengths\"]\n        mel_input = batch[\"mel_input\"]\n        mel_lengths = batch[\"mel_lengths\"]\n        stop_targets = batch[\"stop_targets\"]\n        stop_target_lengths = batch[\"stop_target_lengths\"]\n        speaker_ids = batch[\"speaker_ids\"]\n        d_vectors = batch[\"d_vectors\"]\n\n        aux_input = {\"speaker_ids\": speaker_ids, \"d_vectors\": d_vectors}\n        outputs = self.forward(text_input, text_lengths, mel_input, mel_lengths, aux_input)\n\n        # set the [alignment] lengths wrt reduction factor for guided attention\n        if mel_lengths.max() % self.decoder.r != 0:\n            alignment_lengths = (\n                mel_lengths + (self.decoder.r - (mel_lengths.max() % self.decoder.r))\n            ) // self.decoder.r\n        else:\n            alignment_lengths = mel_lengths // self.decoder.r\n\n        # compute loss\n        with autocast(enabled=False):  # use float32 for the criterion\n            loss_dict = criterion(\n                outputs[\"model_outputs\"].float(),\n                outputs[\"decoder_outputs\"].float(),\n            
    mel_input.float(),\n                None,\n                outputs[\"stop_tokens\"].float(),\n                stop_targets.float(),\n                stop_target_lengths,\n                outputs[\"capacitron_vae_outputs\"] if self.capacitron_vae else None,\n                mel_lengths,\n                None if outputs[\"decoder_outputs_backward\"] is None else outputs[\"decoder_outputs_backward\"].float(),\n                outputs[\"alignments\"].float(),\n                alignment_lengths,\n                None if outputs[\"alignments_backward\"] is None else outputs[\"alignments_backward\"].float(),\n                text_lengths,\n            )\n\n        # compute alignment error (the lower the better )\n        align_error = 1 - alignment_diagonal_score(outputs[\"alignments\"])\n        loss_dict[\"align_error\"] = align_error\n        return outputs, loss_dict\n\n    def get_optimizer(self) -> List:\n        if self.use_capacitron_vae:\n            return CapacitronOptimizer(self.config, self.named_parameters())\n        return get_optimizer(self.config.optimizer, self.config.optimizer_params, self.config.lr, self)\n\n    def get_scheduler(self, optimizer: object):\n        opt = optimizer.primary_optimizer if self.use_capacitron_vae else optimizer\n        return get_scheduler(self.config.lr_scheduler, self.config.lr_scheduler_params, opt)\n\n    def before_gradient_clipping(self):\n        if self.use_capacitron_vae:\n            # Capacitron model specific gradient clipping\n            model_params_to_clip = []\n            for name, param in self.named_parameters():\n                if param.requires_grad:\n                    if name != \"capacitron_vae_layer.beta\":\n                        model_params_to_clip.append(param)\n            torch.nn.utils.clip_grad_norm_(model_params_to_clip, self.capacitron_vae.capacitron_grad_clip)\n\n    def _create_logs(self, batch, outputs, ap):\n        \"\"\"Create dashboard log information.\"\"\"\n        
postnet_outputs = outputs[\"model_outputs\"]\n        alignments = outputs[\"alignments\"]\n        alignments_backward = outputs[\"alignments_backward\"]\n        mel_input = batch[\"mel_input\"]\n\n        pred_spec = postnet_outputs[0].data.cpu().numpy()\n        gt_spec = mel_input[0].data.cpu().numpy()\n        align_img = alignments[0].data.cpu().numpy()\n\n        figures = {\n            \"prediction\": plot_spectrogram(pred_spec, ap, output_fig=False),\n            \"ground_truth\": plot_spectrogram(gt_spec, ap, output_fig=False),\n            \"alignment\": plot_alignment(align_img, output_fig=False),\n        }\n\n        if self.bidirectional_decoder or self.double_decoder_consistency:\n            figures[\"alignment_backward\"] = plot_alignment(alignments_backward[0].data.cpu().numpy(), output_fig=False)\n\n        # Sample audio\n        audio = ap.inv_melspectrogram(pred_spec.T)\n        return figures, {\"audio\": audio}\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ) -> None:  # pylint: disable=no-self-use\n        \"\"\"Log training progress.\"\"\"\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    def eval_step(self, batch: dict, criterion: nn.Module):\n        return self.train_step(batch, criterion)\n\n    def eval_log(self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int) -> None:\n        figures, audios = self._create_logs(batch, outputs, self.ap)\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    @staticmethod\n    def init_from_config(config: \"Tacotron2Config\", samples: Union[List[List], List[Dict]] = None):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (Tacotron2Config): Model config.\n            samples 
(Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n        \"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        ap = AudioProcessor.init_from_config(config)\n        tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(new_config, samples)\n        return Tacotron2(new_config, ap, tokenizer, speaker_manager)\n"
  },
  {
    "path": "TTS/tts/models/vits.py",
    "content": "import math\nimport os\nfrom dataclasses import dataclass, field, replace\nfrom itertools import chain\nfrom typing import Dict, List, Tuple, Union\n\nimport numpy as np\nimport torch\nimport torch.distributed as dist\nimport torchaudio\nfrom coqpit import Coqpit\nfrom librosa.filters import mel as librosa_mel_fn\nfrom torch import nn\nfrom torch.cuda.amp.autocast_mode import autocast\nfrom torch.nn import functional as F\nfrom torch.utils.data import DataLoader\nfrom torch.utils.data.sampler import WeightedRandomSampler\nfrom trainer.torch import DistributedSampler, DistributedSamplerWrapper\nfrom trainer.trainer_utils import get_optimizer, get_scheduler\n\nfrom TTS.tts.configs.shared_configs import CharactersConfig\nfrom TTS.tts.datasets.dataset import TTSDataset, _parse_sample\nfrom TTS.tts.layers.glow_tts.duration_predictor import DurationPredictor\nfrom TTS.tts.layers.vits.discriminator import VitsDiscriminator\nfrom TTS.tts.layers.vits.networks import PosteriorEncoder, ResidualCouplingBlocks, TextEncoder\nfrom TTS.tts.layers.vits.stochastic_duration_predictor import StochasticDurationPredictor\nfrom TTS.tts.models.base_tts import BaseTTS\nfrom TTS.tts.utils.helpers import generate_path, maximum_path, rand_segments, segment, sequence_mask\nfrom TTS.tts.utils.languages import LanguageManager\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.tts.utils.synthesis import synthesis\nfrom TTS.tts.utils.text.characters import BaseCharacters, _characters, _pad, _phonemes, _punctuations\nfrom TTS.tts.utils.text.tokenizer import TTSTokenizer\nfrom TTS.tts.utils.visual import plot_alignment\nfrom TTS.utils.io import load_fsspec\nfrom TTS.utils.samplers import BucketBatchSampler\nfrom TTS.vocoder.models.hifigan_generator import HifiganGenerator\nfrom TTS.vocoder.utils.generic_utils import plot_results\n\n##############################\n# IO / Feature extraction\n##############################\n\n# pylint: disable=global-statement\nhann_window = 
{}\nmel_basis = {}\n\n\n@torch.no_grad()\ndef weights_reset(m: nn.Module):\n    # check if the current module has reset_parameters and if it is reset the weight\n    reset_parameters = getattr(m, \"reset_parameters\", None)\n    if callable(reset_parameters):\n        m.reset_parameters()\n\n\ndef get_module_weights_sum(mdl: nn.Module):\n    dict_sums = {}\n    for name, w in mdl.named_parameters():\n        if \"weight\" in name:\n            value = w.data.sum().item()\n            dict_sums[name] = value\n    return dict_sums\n\n\ndef load_audio(file_path):\n    \"\"\"Load the audio file normalized in [-1, 1]\n\n    Return Shapes:\n        - x: :math:`[1, T]`\n    \"\"\"\n    x, sr = torchaudio.load(file_path)\n    assert (x > 1).sum() + (x < -1).sum() == 0\n    return x, sr\n\n\ndef _amp_to_db(x, C=1, clip_val=1e-5):\n    return torch.log(torch.clamp(x, min=clip_val) * C)\n\n\ndef _db_to_amp(x, C=1):\n    return torch.exp(x) / C\n\n\ndef amp_to_db(magnitudes):\n    output = _amp_to_db(magnitudes)\n    return output\n\n\ndef db_to_amp(magnitudes):\n    output = _db_to_amp(magnitudes)\n    return output\n\n\ndef wav_to_spec(y, n_fft, hop_length, win_length, center=False):\n    \"\"\"\n    Args Shapes:\n        - y : :math:`[B, 1, T]`\n\n    Return Shapes:\n        - spec : :math:`[B,C,T]`\n    \"\"\"\n    y = y.squeeze(1)\n\n    if torch.min(y) < -1.0:\n        print(\"min value is \", torch.min(y))\n    if torch.max(y) > 1.0:\n        print(\"max value is \", torch.max(y))\n\n    global hann_window\n    dtype_device = str(y.dtype) + \"_\" + str(y.device)\n    wnsize_dtype_device = str(win_length) + \"_\" + dtype_device\n    if wnsize_dtype_device not in hann_window:\n        hann_window[wnsize_dtype_device] = torch.hann_window(win_length).to(dtype=y.dtype, device=y.device)\n\n    y = torch.nn.functional.pad(\n        y.unsqueeze(1),\n        (int((n_fft - hop_length) / 2), int((n_fft - hop_length) / 2)),\n        mode=\"reflect\",\n    )\n    y = 
y.squeeze(1)\n\n    spec = torch.stft(\n        y,\n        n_fft,\n        hop_length=hop_length,\n        win_length=win_length,\n        window=hann_window[wnsize_dtype_device],\n        center=center,\n        pad_mode=\"reflect\",\n        normalized=False,\n        onesided=True,\n        return_complex=False,\n    )\n\n    spec = torch.sqrt(spec.pow(2).sum(-1) + 1e-6)\n    return spec\n\n\ndef spec_to_mel(spec, n_fft, num_mels, sample_rate, fmin, fmax):\n    \"\"\"\n    Args Shapes:\n        - spec : :math:`[B,C,T]`\n\n    Return Shapes:\n        - mel : :math:`[B,C,T]`\n    \"\"\"\n    global mel_basis\n    dtype_device = str(spec.dtype) + \"_\" + str(spec.device)\n    fmax_dtype_device = str(fmax) + \"_\" + dtype_device\n    if fmax_dtype_device not in mel_basis:\n        # pass by keyword; librosa >= 0.10 rejects positional arguments here\n        mel = librosa_mel_fn(sr=sample_rate, n_fft=n_fft, n_mels=num_mels, fmin=fmin, fmax=fmax)\n        mel_basis[fmax_dtype_device] = torch.from_numpy(mel).to(dtype=spec.dtype, device=spec.device)\n    mel = torch.matmul(mel_basis[fmax_dtype_device], spec)\n    mel = amp_to_db(mel)\n    return mel\n\n\ndef wav_to_mel(y, n_fft, num_mels, sample_rate, hop_length, win_length, fmin, fmax, center=False):\n    \"\"\"\n    Args Shapes:\n        - y : :math:`[B, 1, T]`\n\n    Return Shapes:\n        - spec : :math:`[B,C,T]`\n    \"\"\"\n    y = y.squeeze(1)\n\n    if torch.min(y) < -1.0:\n        print(\"min value is \", torch.min(y))\n    if torch.max(y) > 1.0:\n        print(\"max value is \", torch.max(y))\n\n    global mel_basis, hann_window\n    dtype_device = str(y.dtype) + \"_\" + str(y.device)\n    fmax_dtype_device = str(fmax) + \"_\" + dtype_device\n    wnsize_dtype_device = str(win_length) + \"_\" + dtype_device\n    if fmax_dtype_device not in mel_basis:\n        # pass by keyword; librosa >= 0.10 rejects positional arguments here\n        mel = librosa_mel_fn(sr=sample_rate, n_fft=n_fft, n_mels=num_mels, fmin=fmin, fmax=fmax)\n        mel_basis[fmax_dtype_device] = torch.from_numpy(mel).to(dtype=y.dtype, device=y.device)\n    if wnsize_dtype_device not in hann_window:\n        hann_window[wnsize_dtype_device] = 
torch.hann_window(win_length).to(dtype=y.dtype, device=y.device)\n\n    y = torch.nn.functional.pad(\n        y.unsqueeze(1),\n        (int((n_fft - hop_length) / 2), int((n_fft - hop_length) / 2)),\n        mode=\"reflect\",\n    )\n    y = y.squeeze(1)\n\n    spec = torch.stft(\n        y,\n        n_fft,\n        hop_length=hop_length,\n        win_length=win_length,\n        window=hann_window[wnsize_dtype_device],\n        center=center,\n        pad_mode=\"reflect\",\n        normalized=False,\n        onesided=True,\n        return_complex=False,\n    )\n\n    spec = torch.sqrt(spec.pow(2).sum(-1) + 1e-6)\n    spec = torch.matmul(mel_basis[fmax_dtype_device], spec)\n    spec = amp_to_db(spec)\n    return spec\n\n\n#############################\n# CONFIGS\n#############################\n\n\n@dataclass\nclass VitsAudioConfig(Coqpit):\n    fft_size: int = 1024\n    sample_rate: int = 22050\n    win_length: int = 1024\n    hop_length: int = 256\n    num_mels: int = 80\n    mel_fmin: int = 0\n    mel_fmax: int = None\n\n\n##############################\n# DATASET\n##############################\n\n\ndef get_attribute_balancer_weights(items: list, attr_name: str, multi_dict: dict = None):\n    \"\"\"Create inverse frequency weights for balancing the dataset.\n    Use `multi_dict` to scale relative weights.\"\"\"\n    attr_names_samples = np.array([item[attr_name] for item in items])\n    unique_attr_names = np.unique(attr_names_samples).tolist()\n    attr_idx = [unique_attr_names.index(l) for l in attr_names_samples]\n    attr_count = np.array([len(np.where(attr_names_samples == l)[0]) for l in unique_attr_names])\n    weight_attr = 1.0 / attr_count\n    dataset_samples_weight = np.array([weight_attr[l] for l in attr_idx])\n    dataset_samples_weight = dataset_samples_weight / np.linalg.norm(dataset_samples_weight)\n    if multi_dict is not None:\n        # check if all keys are in the multi_dict\n        for k in multi_dict:\n            assert k in 
unique_attr_names, f\"{k} not in {unique_attr_names}\"\n        # scale weights\n        multiplier_samples = np.array([multi_dict.get(item[attr_name], 1.0) for item in items])\n        dataset_samples_weight *= multiplier_samples\n    return (\n        torch.from_numpy(dataset_samples_weight).float(),\n        unique_attr_names,\n        np.unique(dataset_samples_weight).tolist(),\n    )\n\n\nclass VitsDataset(TTSDataset):\n    def __init__(self, model_args, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.pad_id = self.tokenizer.characters.pad_id\n        self.model_args = model_args\n\n    def __getitem__(self, idx):\n        item = self.samples[idx]\n        raw_text = item[\"text\"]\n\n        wav, _ = load_audio(item[\"audio_file\"])\n        if self.model_args.encoder_sample_rate is not None:\n            if wav.size(1) % self.model_args.encoder_sample_rate != 0:\n                wav = wav[:, : -int(wav.size(1) % self.model_args.encoder_sample_rate)]\n\n        wav_filename = os.path.basename(item[\"audio_file\"])\n\n        token_ids = self.get_token_ids(idx, item[\"text\"])\n\n        # after phonemization the text length may change\n        # this is a shameful 🤭 hack to prevent longer phonemes\n        # TODO: find a better fix\n        if len(token_ids) > self.max_text_len or wav.shape[1] < self.min_audio_len:\n            self.rescue_item_idx += 1\n            return self.__getitem__(self.rescue_item_idx)\n\n        return {\n            \"raw_text\": raw_text,\n            \"token_ids\": token_ids,\n            \"token_len\": len(token_ids),\n            \"wav\": wav,\n            \"wav_file\": wav_filename,\n            \"speaker_name\": item[\"speaker_name\"],\n            \"language_name\": item[\"language\"],\n            \"audio_unique_name\": item[\"audio_unique_name\"],\n        }\n\n    @property\n    def lengths(self):\n        lens = []\n        for item in self.samples:\n            _, wav_file, *_ = 
_parse_sample(item)\n            audio_len = os.path.getsize(wav_file) / 16 * 8  # assuming 16bit audio\n            lens.append(audio_len)\n        return lens\n\n    def collate_fn(self, batch):\n        \"\"\"\n        Return Shapes:\n            - tokens: :math:`[B, T]`\n            - token_lens :math:`[B]`\n            - token_rel_lens :math:`[B]`\n            - waveform: :math:`[B, 1, T]`\n            - waveform_lens: :math:`[B]`\n            - waveform_rel_lens: :math:`[B]`\n            - speaker_names: :math:`[B]`\n            - language_names: :math:`[B]`\n            - audiofile_paths: :math:`[B]`\n            - raw_texts: :math:`[B]`\n            - audio_unique_names: :math:`[B]`\n        \"\"\"\n        # convert list of dicts to dict of lists\n        B = len(batch)\n        batch = {k: [dic[k] for dic in batch] for k in batch[0]}\n\n        _, ids_sorted_decreasing = torch.sort(\n            torch.LongTensor([x.size(1) for x in batch[\"wav\"]]), dim=0, descending=True\n        )\n\n        max_text_len = max([len(x) for x in batch[\"token_ids\"]])\n        token_lens = torch.LongTensor(batch[\"token_len\"])\n        token_rel_lens = token_lens / token_lens.max()\n\n        wav_lens = [w.shape[1] for w in batch[\"wav\"]]\n        wav_lens = torch.LongTensor(wav_lens)\n        wav_lens_max = torch.max(wav_lens)\n        wav_rel_lens = wav_lens / wav_lens_max\n\n        token_padded = torch.LongTensor(B, max_text_len)\n        wav_padded = torch.FloatTensor(B, 1, wav_lens_max)\n        token_padded = token_padded.zero_() + self.pad_id\n        wav_padded = wav_padded.zero_() + self.pad_id\n        for i in range(len(ids_sorted_decreasing)):\n            token_ids = batch[\"token_ids\"][i]\n            token_padded[i, : batch[\"token_len\"][i]] = torch.LongTensor(token_ids)\n\n            wav = batch[\"wav\"][i]\n            wav_padded[i, :, : wav.size(1)] = torch.FloatTensor(wav)\n\n        return {\n            \"tokens\": token_padded,\n            
\"token_lens\": token_lens,\n            \"token_rel_lens\": token_rel_lens,\n            \"waveform\": wav_padded,  # (B x T)\n            \"waveform_lens\": wav_lens,  # (B)\n            \"waveform_rel_lens\": wav_rel_lens,\n            \"speaker_names\": batch[\"speaker_name\"],\n            \"language_names\": batch[\"language_name\"],\n            \"audio_files\": batch[\"wav_file\"],\n            \"raw_text\": batch[\"raw_text\"],\n            \"audio_unique_names\": batch[\"audio_unique_name\"],\n        }\n\n\n##############################\n# MODEL DEFINITION\n##############################\n\n\n@dataclass\nclass VitsArgs(Coqpit):\n    \"\"\"VITS model arguments.\n\n    Args:\n\n        num_chars (int):\n            Number of characters in the vocabulary. Defaults to 100.\n\n        out_channels (int):\n            Number of output channels of the decoder. Defaults to 513.\n\n        spec_segment_size (int):\n            Decoder input segment size. Defaults to 32 `(32 * hoplength = waveform length)`.\n\n        hidden_channels (int):\n            Number of hidden channels of the model. Defaults to 192.\n\n        hidden_channels_ffn_text_encoder (int):\n            Number of hidden channels of the feed-forward layers of the text encoder transformer. Defaults to 256.\n\n        num_heads_text_encoder (int):\n            Number of attention heads of the text encoder transformer. Defaults to 2.\n\n        num_layers_text_encoder (int):\n            Number of transformer layers in the text encoder. Defaults to 6.\n\n        kernel_size_text_encoder (int):\n            Kernel size of the text encoder transformer FFN layers. Defaults to 3.\n\n        dropout_p_text_encoder (float):\n            Dropout rate of the text encoder. Defaults to 0.1.\n\n        dropout_p_duration_predictor (float):\n            Dropout rate of the duration predictor. 
Defaults to 0.5.\n\n        kernel_size_posterior_encoder (int):\n            Kernel size of the posterior encoder's WaveNet layers. Defaults to 5.\n\n        dilation_rate_posterior_encoder (int):\n            Dilation rate of the posterior encoder's WaveNet layers. Defaults to 1.\n\n        num_layers_posterior_encoder (int):\n            Number of posterior encoder's WaveNet layers. Defaults to 16.\n\n        kernel_size_flow (int):\n            Kernel size of the Residual Coupling layers of the flow network. Defaults to 5.\n\n        dilation_rate_flow (int):\n            Dilation rate of the Residual Coupling WaveNet layers of the flow network. Defaults to 1.\n\n        num_layers_flow (int):\n            Number of Residual Coupling WaveNet layers of the flow network. Defaults to 4.\n\n        resblock_type_decoder (str):\n            Type of the residual block in the decoder network. Defaults to \"1\".\n\n        resblock_kernel_sizes_decoder (List[int]):\n            Kernel sizes of the residual blocks in the decoder network. Defaults to `[3, 7, 11]`.\n\n        resblock_dilation_sizes_decoder (List[List[int]]):\n            Dilation sizes of the residual blocks in the decoder network. Defaults to `[[1, 3, 5], [1, 3, 5], [1, 3, 5]]`.\n\n        upsample_rates_decoder (List[int]):\n            Upsampling rates for each consecutive upsampling layer in the decoder network. The product of these\n            values must equal the hop length used for computing spectrograms. Defaults to `[8, 8, 2, 2]`.\n\n        upsample_initial_channel_decoder (int):\n            Number of hidden channels of the first upsampling convolution layer of the decoder network. Defaults to 512.\n\n        upsample_kernel_sizes_decoder (List[int]):\n            Kernel sizes for each upsampling layer of the decoder network. Defaults to `[16, 16, 4, 4]`.\n\n        periods_multi_period_discriminator (List[int]):\n            Period values for the VITS Multi-Period Discriminator. 
Defaults to `[2, 3, 5, 7, 11]`.\n\n        use_sdp (bool):\n            Use Stochastic Duration Predictor. Defaults to True.\n\n        noise_scale (float):\n            Noise scale used for the sample noise tensor in training. Defaults to 1.0.\n\n        inference_noise_scale (float):\n            Noise scale used for the sample noise tensor in inference. Defaults to 0.667.\n\n        length_scale (float):\n            Scale factor for the predicted duration values. Smaller values result in faster speech. Defaults to 1.\n\n        noise_scale_dp (float):\n            Noise scale used by the Stochastic Duration Predictor sample noise in training. Defaults to 1.0.\n\n        inference_noise_scale_dp (float):\n            Noise scale for the Stochastic Duration Predictor in inference. Defaults to 1.0.\n\n        max_inference_len (int):\n            Maximum inference length to limit the memory use. Defaults to None.\n\n        init_discriminator (bool):\n            Initialize the discriminator network if set to True. Set to False for inference. Defaults to True.\n\n        use_spectral_norm_disriminator (bool):\n            Use spectral normalization instead of weight normalization in the discriminator. Defaults to False.\n\n        use_speaker_embedding (bool):\n            Enable/Disable speaker embedding for multi-speaker models. Defaults to False.\n\n        num_speakers (int):\n            Number of speakers for the speaker embedding layer. Defaults to 0.\n\n        speakers_file (str):\n            Path to the speaker mapping file for the Speaker Manager. Defaults to None.\n\n        speaker_embedding_channels (int):\n            Number of speaker embedding channels. Defaults to 256.\n\n        use_d_vector_file (bool):\n            Enable/Disable the use of d-vectors for multi-speaker training. Defaults to False.\n\n        d_vector_file (List[str]):\n            List of paths to the files including pre-computed speaker embeddings. 
Defaults to None.\n\n        d_vector_dim (int):\n            Number of d-vector channels. Defaults to 0.\n\n        detach_dp_input (bool):\n            Detach the duration predictor's input from the network to stop the gradients. Defaults to True.\n\n        use_language_embedding (bool):\n            Enable/Disable language embedding for multilingual models. Defaults to False.\n\n        embedded_language_dim (int):\n            Number of language embedding channels. Defaults to 4.\n\n        num_languages (int):\n            Number of languages for the language embedding layer. Defaults to 0.\n\n        language_ids_file (str):\n            Path to the language mapping file for the Language Manager. Defaults to None.\n\n        use_speaker_encoder_as_loss (bool):\n            Enable/Disable Speaker Consistency Loss (SCL). Defaults to False.\n\n        speaker_encoder_config_path (str):\n            Path to the speaker encoder config file, to use for SCL. Defaults to \"\".\n\n        speaker_encoder_model_path (str):\n            Path to the speaker encoder checkpoint file, to use for SCL. Defaults to \"\".\n\n        condition_dp_on_speaker (bool):\n            Condition the duration predictor on the speaker embedding. Defaults to True.\n\n        freeze_encoder (bool):\n            Freeze the encoder weights during training. Defaults to False.\n\n        freeze_DP (bool):\n            Freeze the duration predictor weights during training. Defaults to False.\n\n        freeze_PE (bool):\n            Freeze the posterior encoder weights during training. Defaults to False.\n\n        freeze_flow_decoder (bool):\n            Freeze the flow decoder weights during training. Defaults to False.\n\n        freeze_waveform_decoder (bool):\n            Freeze the waveform decoder weights during training. 
Defaults to False.\n\n        encoder_sample_rate (int):\n            If not None, this sample rate will be used for training the Posterior Encoder,\n            flow, text_encoder and duration predictor. The decoder part (vocoder) will be\n            trained with the `config.audio.sample_rate`. Defaults to None.\n\n        interpolate_z (bool):\n            If `encoder_sample_rate` is not None and this parameter is True, linear interpolation\n            is used to upsample the latent variable z from the sampling rate `encoder_sample_rate`\n            to `config.audio.sample_rate`. If it is False, you will need to add extra\n            `upsample_rates_decoder` to match the shape. Defaults to True.\n\n    \"\"\"\n\n    num_chars: int = 100\n    out_channels: int = 513\n    spec_segment_size: int = 32\n    hidden_channels: int = 192\n    hidden_channels_ffn_text_encoder: int = 768\n    num_heads_text_encoder: int = 2\n    num_layers_text_encoder: int = 6\n    kernel_size_text_encoder: int = 3\n    dropout_p_text_encoder: float = 0.1\n    dropout_p_duration_predictor: float = 0.5\n    kernel_size_posterior_encoder: int = 5\n    dilation_rate_posterior_encoder: int = 1\n    num_layers_posterior_encoder: int = 16\n    kernel_size_flow: int = 5\n    dilation_rate_flow: int = 1\n    num_layers_flow: int = 4\n    resblock_type_decoder: str = \"1\"\n    resblock_kernel_sizes_decoder: List[int] = field(default_factory=lambda: [3, 7, 11])\n    resblock_dilation_sizes_decoder: List[List[int]] = field(default_factory=lambda: [[1, 3, 5], [1, 3, 5], [1, 3, 5]])\n    upsample_rates_decoder: List[int] = field(default_factory=lambda: [8, 8, 2, 2])\n    upsample_initial_channel_decoder: int = 512\n    upsample_kernel_sizes_decoder: List[int] = field(default_factory=lambda: [16, 16, 4, 4])\n    periods_multi_period_discriminator: List[int] = field(default_factory=lambda: [2, 3, 5, 7, 11])\n    use_sdp: bool = True\n    noise_scale: float = 1.0\n    inference_noise_scale: 
float = 0.667\n    length_scale: float = 1\n    noise_scale_dp: float = 1.0\n    inference_noise_scale_dp: float = 1.0\n    max_inference_len: int = None\n    init_discriminator: bool = True\n    use_spectral_norm_disriminator: bool = False\n    use_speaker_embedding: bool = False\n    num_speakers: int = 0\n    speakers_file: str = None\n    d_vector_file: List[str] = None\n    speaker_embedding_channels: int = 256\n    use_d_vector_file: bool = False\n    d_vector_dim: int = 0\n    detach_dp_input: bool = True\n    use_language_embedding: bool = False\n    embedded_language_dim: int = 4\n    num_languages: int = 0\n    language_ids_file: str = None\n    use_speaker_encoder_as_loss: bool = False\n    speaker_encoder_config_path: str = \"\"\n    speaker_encoder_model_path: str = \"\"\n    condition_dp_on_speaker: bool = True\n    freeze_encoder: bool = False\n    freeze_DP: bool = False\n    freeze_PE: bool = False\n    freeze_flow_decoder: bool = False\n    freeze_waveform_decoder: bool = False\n    encoder_sample_rate: int = None\n    interpolate_z: bool = True\n    reinit_DP: bool = False\n    reinit_text_encoder: bool = False\n\n\nclass Vits(BaseTTS):\n    \"\"\"VITS TTS model\n\n    Paper::\n        https://arxiv.org/pdf/2106.06103.pdf\n\n    Paper Abstract::\n        Several recent end-to-end text-to-speech (TTS) models enabling single-stage training and parallel\n        sampling have been proposed, but their sample quality does not match that of two-stage TTS systems.\n        In this work, we present a parallel end-to-end TTS method that generates more natural sounding audio than\n        current two-stage models. Our method adopts variational inference augmented with normalizing flows and\n        an adversarial training process, which improves the expressive power of generative modeling. We also propose a\n        stochastic duration predictor to synthesize speech with diverse rhythms from input text. 
With the\n        uncertainty modeling over latent variables and the stochastic duration predictor, our method expresses the\n        natural one-to-many relationship in which a text input can be spoken in multiple ways\n        with different pitches and rhythms. A subjective human evaluation (mean opinion score, or MOS)\n        on the LJ Speech, a single speaker dataset, shows that our method outperforms the best publicly\n        available TTS systems and achieves a MOS comparable to ground truth.\n\n    Check :class:`TTS.tts.configs.vits_config.VitsConfig` for class arguments.\n\n    Examples:\n        >>> from TTS.tts.configs.vits_config import VitsConfig\n        >>> from TTS.tts.models.vits import Vits\n        >>> config = VitsConfig()\n        >>> model = Vits(config)\n    \"\"\"\n\n    def __init__(\n        self,\n        config: Coqpit,\n        ap: \"AudioProcessor\" = None,\n        tokenizer: \"TTSTokenizer\" = None,\n        speaker_manager: SpeakerManager = None,\n        language_manager: LanguageManager = None,\n    ):\n        super().__init__(config, ap, tokenizer, speaker_manager, language_manager)\n\n        self.init_multispeaker(config)\n        self.init_multilingual(config)\n        self.init_upsampling()\n\n        self.length_scale = self.args.length_scale\n        self.noise_scale = self.args.noise_scale\n        self.inference_noise_scale = self.args.inference_noise_scale\n        self.inference_noise_scale_dp = self.args.inference_noise_scale_dp\n        self.noise_scale_dp = self.args.noise_scale_dp\n        self.max_inference_len = self.args.max_inference_len\n        self.spec_segment_size = self.args.spec_segment_size\n\n        self.text_encoder = TextEncoder(\n            self.args.num_chars,\n            self.args.hidden_channels,\n            self.args.hidden_channels,\n            self.args.hidden_channels_ffn_text_encoder,\n            self.args.num_heads_text_encoder,\n            self.args.num_layers_text_encoder,\n      
      self.args.kernel_size_text_encoder,\n            self.args.dropout_p_text_encoder,\n            language_emb_dim=self.embedded_language_dim,\n        )\n\n        self.posterior_encoder = PosteriorEncoder(\n            self.args.out_channels,\n            self.args.hidden_channels,\n            self.args.hidden_channels,\n            kernel_size=self.args.kernel_size_posterior_encoder,\n            dilation_rate=self.args.dilation_rate_posterior_encoder,\n            num_layers=self.args.num_layers_posterior_encoder,\n            cond_channels=self.embedded_speaker_dim,\n        )\n\n        self.flow = ResidualCouplingBlocks(\n            self.args.hidden_channels,\n            self.args.hidden_channels,\n            kernel_size=self.args.kernel_size_flow,\n            dilation_rate=self.args.dilation_rate_flow,\n            num_layers=self.args.num_layers_flow,\n            cond_channels=self.embedded_speaker_dim,\n        )\n\n        if self.args.use_sdp:\n            self.duration_predictor = StochasticDurationPredictor(\n                self.args.hidden_channels,\n                192,\n                3,\n                self.args.dropout_p_duration_predictor,\n                4,\n                cond_channels=self.embedded_speaker_dim if self.args.condition_dp_on_speaker else 0,\n                language_emb_dim=self.embedded_language_dim,\n            )\n        else:\n            self.duration_predictor = DurationPredictor(\n                self.args.hidden_channels,\n                256,\n                3,\n                self.args.dropout_p_duration_predictor,\n                cond_channels=self.embedded_speaker_dim,\n                language_emb_dim=self.embedded_language_dim,\n            )\n\n        self.waveform_decoder = HifiganGenerator(\n            self.args.hidden_channels,\n            1,\n            self.args.resblock_type_decoder,\n            self.args.resblock_dilation_sizes_decoder,\n            
self.args.resblock_kernel_sizes_decoder,\n            self.args.upsample_kernel_sizes_decoder,\n            self.args.upsample_initial_channel_decoder,\n            self.args.upsample_rates_decoder,\n            inference_padding=0,\n            cond_channels=self.embedded_speaker_dim,\n            conv_pre_weight_norm=False,\n            conv_post_weight_norm=False,\n            conv_post_bias=False,\n        )\n\n        if self.args.init_discriminator:\n            self.disc = VitsDiscriminator(\n                periods=self.args.periods_multi_period_discriminator,\n                use_spectral_norm=self.args.use_spectral_norm_disriminator,\n            )\n\n    @property\n    def device(self):\n        return next(self.parameters()).device\n\n    def init_multispeaker(self, config: Coqpit):\n        \"\"\"Initialize multi-speaker modules of a model. A model can be trained either with a speaker embedding layer\n        or with external `d_vectors` computed from a speaker encoder model.\n\n        You must provide a `speaker_manager` at initialization to set up the multi-speaker modules.\n\n        Args:\n            config (Coqpit): Model configuration.\n            data (List, optional): Dataset items to infer number of speakers. 
Defaults to None.\n        \"\"\"\n        self.embedded_speaker_dim = 0\n        self.num_speakers = self.args.num_speakers\n        self.audio_transform = None\n\n        if self.speaker_manager:\n            self.num_speakers = self.speaker_manager.num_speakers\n\n        if self.args.use_speaker_embedding:\n            self._init_speaker_embedding()\n\n        if self.args.use_d_vector_file:\n            self._init_d_vector()\n\n        # TODO: make this a function\n        if self.args.use_speaker_encoder_as_loss:\n            if self.speaker_manager.encoder is None and (\n                not self.args.speaker_encoder_model_path or not self.args.speaker_encoder_config_path\n            ):\n                raise RuntimeError(\n                    \" [!] To use the speaker consistency loss (SCL) you need to specify speaker_encoder_model_path and speaker_encoder_config_path !!\"\n                )\n\n            self.speaker_manager.encoder.eval()\n            print(\" > External Speaker Encoder Loaded !!\")\n\n            if (\n                hasattr(self.speaker_manager.encoder, \"audio_config\")\n                and self.config.audio.sample_rate != self.speaker_manager.encoder.audio_config[\"sample_rate\"]\n            ):\n                self.audio_transform = torchaudio.transforms.Resample(\n                    orig_freq=self.config.audio.sample_rate,\n                    new_freq=self.speaker_manager.encoder.audio_config[\"sample_rate\"],\n                )\n\n    def _init_speaker_embedding(self):\n        # pylint: disable=attribute-defined-outside-init\n        if self.num_speakers > 0:\n            print(\" > initialization of speaker-embedding layers.\")\n            self.embedded_speaker_dim = self.args.speaker_embedding_channels\n            self.emb_g = nn.Embedding(self.num_speakers, self.embedded_speaker_dim)\n\n    def _init_d_vector(self):\n        # pylint: disable=attribute-defined-outside-init\n        if hasattr(self, \"emb_g\"):\n          
  raise ValueError(\"[!] Speaker embedding layer already initialized before d_vector settings.\")\n        self.embedded_speaker_dim = self.args.d_vector_dim\n\n    def init_multilingual(self, config: Coqpit):\n        \"\"\"Initialize multilingual modules of a model.\n\n        Args:\n            config (Coqpit): Model configuration.\n        \"\"\"\n        if self.args.language_ids_file is not None:\n            self.language_manager = LanguageManager(language_ids_file_path=config.language_ids_file)\n\n        if self.args.use_language_embedding and self.language_manager:\n            print(\" > initialization of language-embedding layers.\")\n            self.num_languages = self.language_manager.num_languages\n            self.embedded_language_dim = self.args.embedded_language_dim\n            self.emb_l = nn.Embedding(self.num_languages, self.embedded_language_dim)\n            torch.nn.init.xavier_uniform_(self.emb_l.weight)\n        else:\n            self.embedded_language_dim = 0\n\n    def init_upsampling(self):\n        \"\"\"\n        Initialize upsampling modules of a model.\n        \"\"\"\n        if self.args.encoder_sample_rate:\n            self.interpolate_factor = self.config.audio[\"sample_rate\"] / self.args.encoder_sample_rate\n            self.audio_resampler = torchaudio.transforms.Resample(\n                orig_freq=self.config.audio[\"sample_rate\"], new_freq=self.args.encoder_sample_rate\n            )  # pylint: disable=W0201\n\n    def on_epoch_start(self, trainer):  # pylint: disable=W0613\n        \"\"\"Freeze layers at the beginning of an epoch\"\"\"\n        self._freeze_layers()\n        # set the device of speaker encoder\n        if self.args.use_speaker_encoder_as_loss:\n            self.speaker_manager.encoder = self.speaker_manager.encoder.to(self.device)\n\n    def on_init_end(self, trainer):  # pylint: disable=W0613\n        \"\"\"Reinit layes if needed\"\"\"\n        if self.args.reinit_DP:\n            before_dict = 
get_module_weights_sum(self.duration_predictor)\n            # Applies weights_reset recursively to every submodule of the duration predictor\n            self.duration_predictor.apply(fn=weights_reset)\n            after_dict = get_module_weights_sum(self.duration_predictor)\n            for key, value in after_dict.items():\n                if value == before_dict[key]:\n                    raise RuntimeError(\" [!] The weights of Duration Predictor was not reinit check it !\")\n            print(\" > Duration Predictor was reinit.\")\n\n        if self.args.reinit_text_encoder:\n            before_dict = get_module_weights_sum(self.text_encoder)\n            # Applies weights_reset recursively to every submodule of the text encoder\n            self.text_encoder.apply(fn=weights_reset)\n            after_dict = get_module_weights_sum(self.text_encoder)\n            for key, value in after_dict.items():\n                if value == before_dict[key]:\n                    raise RuntimeError(\" [!] 
The weights of Text Encoder was not reinit check it !\")\n            print(\" > Text Encoder was reinit.\")\n\n    def get_aux_input(self, aux_input: Dict):\n        sid, g, lid, _ = self._set_cond_input(aux_input)\n        return {\"speaker_ids\": sid, \"style_wav\": None, \"d_vectors\": g, \"language_ids\": lid}\n\n    def _freeze_layers(self):\n        if self.args.freeze_encoder:\n            for param in self.text_encoder.parameters():\n                param.requires_grad = False\n\n            if hasattr(self, \"emb_l\"):\n                for param in self.emb_l.parameters():\n                    param.requires_grad = False\n\n        if self.args.freeze_PE:\n            for param in self.posterior_encoder.parameters():\n                param.requires_grad = False\n\n        if self.args.freeze_DP:\n            for param in self.duration_predictor.parameters():\n                param.requires_grad = False\n\n        if self.args.freeze_flow_decoder:\n            for param in self.flow.parameters():\n                param.requires_grad = False\n\n        if self.args.freeze_waveform_decoder:\n            for param in self.waveform_decoder.parameters():\n                param.requires_grad = False\n\n    @staticmethod\n    def _set_cond_input(aux_input: Dict):\n        \"\"\"Set the speaker conditioning input based on the multi-speaker mode.\"\"\"\n        sid, g, lid, durations = None, None, None, None\n        if \"speaker_ids\" in aux_input and aux_input[\"speaker_ids\"] is not None:\n            sid = aux_input[\"speaker_ids\"]\n            if sid.ndim == 0:\n                sid = sid.unsqueeze_(0)\n        if \"d_vectors\" in aux_input and aux_input[\"d_vectors\"] is not None:\n            g = F.normalize(aux_input[\"d_vectors\"]).unsqueeze(-1)\n            if g.ndim == 2:\n                g = g.unsqueeze_(0)\n\n        if \"language_ids\" in aux_input and aux_input[\"language_ids\"] is not None:\n            lid = aux_input[\"language_ids\"]\n            
if lid.ndim == 0:\n                lid = lid.unsqueeze_(0)\n\n        if \"durations\" in aux_input and aux_input[\"durations\"] is not None:\n            durations = aux_input[\"durations\"]\n\n        return sid, g, lid, durations\n\n    def _set_speaker_input(self, aux_input: Dict):\n        d_vectors = aux_input.get(\"d_vectors\", None)\n        speaker_ids = aux_input.get(\"speaker_ids\", None)\n\n        if d_vectors is not None and speaker_ids is not None:\n            raise ValueError(\"[!] Cannot use d-vectors and speaker-ids together.\")\n\n        if speaker_ids is not None and not hasattr(self, \"emb_g\"):\n            raise ValueError(\"[!] Cannot use speaker-ids without enabling speaker embedding.\")\n\n        g = speaker_ids if speaker_ids is not None else d_vectors\n        return g\n\n    def forward_mas(self, outputs, z_p, m_p, logs_p, x, x_mask, y_mask, g, lang_emb):\n        # find the alignment path\n        attn_mask = torch.unsqueeze(x_mask, -1) * torch.unsqueeze(y_mask, 2)\n        with torch.no_grad():\n            o_scale = torch.exp(-2 * logs_p)\n            logp1 = torch.sum(-0.5 * math.log(2 * math.pi) - logs_p, [1]).unsqueeze(-1)  # [b, t, 1]\n            logp2 = torch.einsum(\"klm, kln -> kmn\", [o_scale, -0.5 * (z_p**2)])\n            logp3 = torch.einsum(\"klm, kln -> kmn\", [m_p * o_scale, z_p])\n            logp4 = torch.sum(-0.5 * (m_p**2) * o_scale, [1]).unsqueeze(-1)  # [b, t, 1]\n            logp = logp2 + logp3 + logp1 + logp4\n            attn = maximum_path(logp, attn_mask.squeeze(1)).unsqueeze(1).detach()  # [b, 1, t, t']\n\n        # duration predictor\n        attn_durations = attn.sum(3)\n        if self.args.use_sdp:\n            loss_duration = self.duration_predictor(\n                x.detach() if self.args.detach_dp_input else x,\n                x_mask,\n                attn_durations,\n                g=g.detach() if self.args.detach_dp_input and g is not None else g,\n                lang_emb=lang_emb.detach() 
if self.args.detach_dp_input and lang_emb is not None else lang_emb,\n            )\n            loss_duration = loss_duration / torch.sum(x_mask)\n        else:\n            attn_log_durations = torch.log(attn_durations + 1e-6) * x_mask\n            log_durations = self.duration_predictor(\n                x.detach() if self.args.detach_dp_input else x,\n                x_mask,\n                g=g.detach() if self.args.detach_dp_input and g is not None else g,\n                lang_emb=lang_emb.detach() if self.args.detach_dp_input and lang_emb is not None else lang_emb,\n            )\n            loss_duration = torch.sum((log_durations - attn_log_durations) ** 2, [1, 2]) / torch.sum(x_mask)\n        outputs[\"loss_duration\"] = loss_duration\n        return outputs, attn\n\n    def upsampling_z(self, z, slice_ids=None, y_lengths=None, y_mask=None):\n        spec_segment_size = self.spec_segment_size\n        if self.args.encoder_sample_rate:\n            # recompute the slices and spec_segment_size if needed\n            slice_ids = slice_ids * int(self.interpolate_factor) if slice_ids is not None else slice_ids\n            spec_segment_size = spec_segment_size * int(self.interpolate_factor)\n            # interpolate z if needed\n            if self.args.interpolate_z:\n                z = torch.nn.functional.interpolate(z, scale_factor=[self.interpolate_factor], mode=\"linear\").squeeze(0)\n                # recompute the mask if needed\n                if y_lengths is not None and y_mask is not None:\n                    y_mask = (\n                        sequence_mask(y_lengths * self.interpolate_factor, None).to(y_mask.dtype).unsqueeze(1)\n                    )  # [B, 1, T_dec_resampled]\n\n        return z, spec_segment_size, slice_ids, y_mask\n\n    def forward(  # pylint: disable=dangerous-default-value\n        self,\n        x: torch.tensor,\n        x_lengths: torch.tensor,\n        y: torch.tensor,\n        y_lengths: torch.tensor,\n        
waveform: torch.tensor,\n        aux_input={\"d_vectors\": None, \"speaker_ids\": None, \"language_ids\": None},\n    ) -> Dict:\n        \"\"\"Forward pass of the model.\n\n        Args:\n            x (torch.tensor): Batch of input character sequence IDs.\n            x_lengths (torch.tensor): Batch of input character sequence lengths.\n            y (torch.tensor): Batch of input spectrograms.\n            y_lengths (torch.tensor): Batch of input spectrogram lengths.\n            waveform (torch.tensor): Batch of ground truth waveforms per sample.\n            aux_input (dict, optional): Auxiliary inputs for multi-speaker and multi-lingual training.\n                Defaults to {\"d_vectors\": None, \"speaker_ids\": None, \"language_ids\": None}.\n\n        Returns:\n            Dict: model outputs keyed by the output name.\n\n        Shapes:\n            - x: :math:`[B, T_seq]`\n            - x_lengths: :math:`[B]`\n            - y: :math:`[B, C, T_spec]`\n            - y_lengths: :math:`[B]`\n            - waveform: :math:`[B, 1, T_wav]`\n            - d_vectors: :math:`[B, C, 1]`\n            - speaker_ids: :math:`[B]`\n            - language_ids: :math:`[B]`\n\n        Return Shapes:\n            - model_outputs: :math:`[B, 1, T_wav]`\n            - alignments: :math:`[B, T_seq, T_dec]`\n            - z: :math:`[B, C, T_dec]`\n            - z_p: :math:`[B, C, T_dec]`\n            - m_p: :math:`[B, C, T_dec]`\n            - logs_p: :math:`[B, C, T_dec]`\n            - m_q: :math:`[B, C, T_dec]`\n            - logs_q: :math:`[B, C, T_dec]`\n            - waveform_seg: :math:`[B, 1, spec_seg_size * hop_length]`\n            - gt_spk_emb: :math:`[B, 1, speaker_encoder.proj_dim]`\n            - syn_spk_emb: :math:`[B, 1, speaker_encoder.proj_dim]`\n        \"\"\"\n        outputs = {}\n        sid, g, lid, _ = self._set_cond_input(aux_input)\n        # speaker embedding\n        if self.args.use_speaker_embedding and sid is not None:\n            g = 
self.emb_g(sid).unsqueeze(-1)  # [b, h, 1]\n\n        # language embedding\n        lang_emb = None\n        if self.args.use_language_embedding and lid is not None:\n            lang_emb = self.emb_l(lid).unsqueeze(-1)\n\n        x, m_p, logs_p, x_mask = self.text_encoder(x, x_lengths, lang_emb=lang_emb)\n\n        # posterior encoder\n        z, m_q, logs_q, y_mask = self.posterior_encoder(y, y_lengths, g=g)\n\n        # flow layers\n        z_p = self.flow(z, y_mask, g=g)\n\n        # duration predictor\n        outputs, attn = self.forward_mas(outputs, z_p, m_p, logs_p, x, x_mask, y_mask, g=g, lang_emb=lang_emb)\n\n        # expand prior\n        m_p = torch.einsum(\"klmn, kjm -> kjn\", [attn, m_p])\n        logs_p = torch.einsum(\"klmn, kjm -> kjn\", [attn, logs_p])\n\n        # select a random feature segment for the waveform decoder\n        z_slice, slice_ids = rand_segments(z, y_lengths, self.spec_segment_size, let_short_samples=True, pad_short=True)\n\n        # interpolate z if needed\n        z_slice, spec_segment_size, slice_ids, _ = self.upsampling_z(z_slice, slice_ids=slice_ids)\n\n        o = self.waveform_decoder(z_slice, g=g)\n\n        wav_seg = segment(\n            waveform,\n            slice_ids * self.config.audio.hop_length,\n            spec_segment_size * self.config.audio.hop_length,\n            pad_short=True,\n        )\n\n        if self.args.use_speaker_encoder_as_loss and self.speaker_manager.encoder is not None:\n            # concate generated and GT waveforms\n            wavs_batch = torch.cat((wav_seg, o), dim=0)\n\n            # resample audio to speaker encoder sample_rate\n            # pylint: disable=W0105\n            if self.audio_transform is not None:\n                wavs_batch = self.audio_transform(wavs_batch)\n\n            pred_embs = self.speaker_manager.encoder.forward(wavs_batch, l2_norm=True)\n\n            # split generated and GT speaker embeddings\n            gt_spk_emb, syn_spk_emb = 
torch.chunk(pred_embs, 2, dim=0)\n        else:\n            gt_spk_emb, syn_spk_emb = None, None\n\n        outputs.update(\n            {\n                \"model_outputs\": o,\n                \"alignments\": attn.squeeze(1),\n                \"m_p\": m_p,\n                \"logs_p\": logs_p,\n                \"z\": z,\n                \"z_p\": z_p,\n                \"m_q\": m_q,\n                \"logs_q\": logs_q,\n                \"waveform_seg\": wav_seg,\n                \"gt_spk_emb\": gt_spk_emb,\n                \"syn_spk_emb\": syn_spk_emb,\n                \"slice_ids\": slice_ids,\n            }\n        )\n        return outputs\n\n    @staticmethod\n    def _set_x_lengths(x, aux_input):\n        if \"x_lengths\" in aux_input and aux_input[\"x_lengths\"] is not None:\n            return aux_input[\"x_lengths\"]\n        return torch.tensor(x.shape[1:2]).to(x.device)\n\n    @torch.no_grad()\n    def inference(\n        self,\n        x,\n        aux_input={\"x_lengths\": None, \"d_vectors\": None, \"speaker_ids\": None, \"language_ids\": None, \"durations\": None},\n    ):  # pylint: disable=dangerous-default-value\n        \"\"\"\n        Note:\n            To run in batch mode, provide `x_lengths` else model assumes that the batch size is 1.\n\n        Shapes:\n            - x: :math:`[B, T_seq]`\n            - x_lengths: :math:`[B]`\n            - d_vectors: :math:`[B, C]`\n            - speaker_ids: :math:`[B]`\n\n        Return Shapes:\n            - model_outputs: :math:`[B, 1, T_wav]`\n            - alignments: :math:`[B, T_seq, T_dec]`\n            - z: :math:`[B, C, T_dec]`\n            - z_p: :math:`[B, C, T_dec]`\n            - m_p: :math:`[B, C, T_dec]`\n            - logs_p: :math:`[B, C, T_dec]`\n        \"\"\"\n        sid, g, lid, durations = self._set_cond_input(aux_input)\n        x_lengths = self._set_x_lengths(x, aux_input)\n\n        # speaker embedding\n        if self.args.use_speaker_embedding and sid is not None:\n            
g = self.emb_g(sid).unsqueeze(-1)\n\n        # language embedding\n        lang_emb = None\n        if self.args.use_language_embedding and lid is not None:\n            lang_emb = self.emb_l(lid).unsqueeze(-1)\n\n        x, m_p, logs_p, x_mask = self.text_encoder(x, x_lengths, lang_emb=lang_emb)\n\n        if durations is None:\n            if self.args.use_sdp:\n                logw = self.duration_predictor(\n                    x,\n                    x_mask,\n                    g=g if self.args.condition_dp_on_speaker else None,\n                    reverse=True,\n                    noise_scale=self.inference_noise_scale_dp,\n                    lang_emb=lang_emb,\n                )\n            else:\n                logw = self.duration_predictor(\n                    x, x_mask, g=g if self.args.condition_dp_on_speaker else None, lang_emb=lang_emb\n                )\n            w = torch.exp(logw) * x_mask * self.length_scale\n        else:\n            assert durations.shape[-1] == x.shape[-1]\n            w = durations.unsqueeze(0)\n\n        w_ceil = torch.ceil(w)\n        y_lengths = torch.clamp_min(torch.sum(w_ceil, [1, 2]), 1).long()\n        y_mask = sequence_mask(y_lengths, None).to(x_mask.dtype).unsqueeze(1)  # [B, 1, T_dec]\n\n        attn_mask = x_mask * y_mask.transpose(1, 2)  # [B, 1, T_enc] * [B, T_dec, 1]\n        attn = generate_path(w_ceil.squeeze(1), attn_mask.squeeze(1).transpose(1, 2))\n\n        m_p = torch.matmul(attn.transpose(1, 2), m_p.transpose(1, 2)).transpose(1, 2)\n        logs_p = torch.matmul(attn.transpose(1, 2), logs_p.transpose(1, 2)).transpose(1, 2)\n\n        z_p = m_p + torch.randn_like(m_p) * torch.exp(logs_p) * self.inference_noise_scale\n        z = self.flow(z_p, y_mask, g=g, reverse=True)\n\n        # upsampling if needed\n        z, _, _, y_mask = self.upsampling_z(z, y_lengths=y_lengths, y_mask=y_mask)\n\n        o = self.waveform_decoder((z * y_mask)[:, :, : self.max_inference_len], g=g)\n\n        outputs = 
{\n            \"model_outputs\": o,\n            \"alignments\": attn.squeeze(1),\n            \"durations\": w_ceil,\n            \"z\": z,\n            \"z_p\": z_p,\n            \"m_p\": m_p,\n            \"logs_p\": logs_p,\n            \"y_mask\": y_mask,\n        }\n        return outputs\n\n    @torch.no_grad()\n    def inference_voice_conversion(\n        self, reference_wav, speaker_id=None, d_vector=None, reference_speaker_id=None, reference_d_vector=None\n    ):\n        \"\"\"Inference for voice conversion\n\n        Args:\n            reference_wav (Tensor): Reference waveform. Tensor of shape [B, T]\n            speaker_id (Tensor): speaker_id of the target speaker. Tensor of shape [B]\n            d_vector (Tensor): d_vector embedding of target speaker. Tensor of shape `[B, C]`\n            reference_speaker_id (Tensor): speaker_id of the reference_wav speaker. Tensor of shape [B]\n            reference_d_vector (Tensor): d_vector embedding of the reference_wav speaker. Tensor of shape `[B, C]`\n        \"\"\"\n        # compute spectrograms\n        y = wav_to_spec(\n            reference_wav,\n            self.config.audio.fft_size,\n            self.config.audio.hop_length,\n            self.config.audio.win_length,\n            center=False,\n        )\n        y_lengths = torch.tensor([y.size(-1)]).to(y.device)\n        speaker_cond_src = reference_speaker_id if reference_speaker_id is not None else reference_d_vector\n        speaker_cond_tgt = speaker_id if speaker_id is not None else d_vector\n        wav, _, _ = self.voice_conversion(y, y_lengths, speaker_cond_src, speaker_cond_tgt)\n        return wav\n\n    def voice_conversion(self, y, y_lengths, speaker_cond_src, speaker_cond_tgt):\n        \"\"\"Forward pass for voice conversion\n\n        TODO: create an end-point for voice conversion\n\n        Args:\n            y (Tensor): Reference spectrograms. 
Tensor of shape [B, T, C]\n            y_lengths (Tensor): Length of each reference spectrogram. Tensor of shape [B]\n            speaker_cond_src (Tensor): Reference speaker ID. Tensor of shape [B,]\n            speaker_cond_tgt (Tensor): Target speaker ID. Tensor of shape [B,]\n        \"\"\"\n        assert self.num_speakers > 0, \"num_speakers have to be larger than 0.\"\n        # speaker embedding\n        if self.args.use_speaker_embedding and not self.args.use_d_vector_file:\n            g_src = self.emb_g(torch.from_numpy((np.array(speaker_cond_src))).unsqueeze(0)).unsqueeze(-1)\n            g_tgt = self.emb_g(torch.from_numpy((np.array(speaker_cond_tgt))).unsqueeze(0)).unsqueeze(-1)\n        elif not self.args.use_speaker_embedding and self.args.use_d_vector_file:\n            g_src = F.normalize(speaker_cond_src).unsqueeze(-1)\n            g_tgt = F.normalize(speaker_cond_tgt).unsqueeze(-1)\n        else:\n            raise RuntimeError(\" [!] Voice conversion is only supported on multi-speaker models.\")\n\n        z, _, _, y_mask = self.posterior_encoder(y, y_lengths, g=g_src)\n        z_p = self.flow(z, y_mask, g=g_src)\n        z_hat = self.flow(z_p, y_mask, g=g_tgt, reverse=True)\n        o_hat = self.waveform_decoder(z_hat * y_mask, g=g_tgt)\n        return o_hat, y_mask, (z, z_p, z_hat)\n\n    def train_step(self, batch: dict, criterion: nn.Module, optimizer_idx: int) -> Tuple[Dict, Dict]:\n        \"\"\"Perform a single training step. Run the model forward pass and compute losses.\n\n        Args:\n            batch (Dict): Input tensors.\n            criterion (nn.Module): Loss layer designed for the model.\n            optimizer_idx (int): Index of optimizer to use. 
0 for the discriminator and 1 for the generator networks.\n\n        Returns:\n            Tuple[Dict, Dict]: Model outputs and computed losses.\n        \"\"\"\n\n        spec_lens = batch[\"spec_lens\"]\n\n        if optimizer_idx == 0:\n            tokens = batch[\"tokens\"]\n            token_lengths = batch[\"token_lens\"]\n            spec = batch[\"spec\"]\n\n            d_vectors = batch[\"d_vectors\"]\n            speaker_ids = batch[\"speaker_ids\"]\n            language_ids = batch[\"language_ids\"]\n            waveform = batch[\"waveform\"]\n\n            # generator pass\n            outputs = self.forward(\n                tokens,\n                token_lengths,\n                spec,\n                spec_lens,\n                waveform,\n                aux_input={\"d_vectors\": d_vectors, \"speaker_ids\": speaker_ids, \"language_ids\": language_ids},\n            )\n\n            # cache tensors for the generator pass\n            self.model_outputs_cache = outputs  # pylint: disable=attribute-defined-outside-init\n\n            # compute scores and features\n            scores_disc_fake, _, scores_disc_real, _ = self.disc(\n                outputs[\"model_outputs\"].detach(), outputs[\"waveform_seg\"]\n            )\n\n            # compute loss\n            with autocast(enabled=False):  # use float32 for the criterion\n                loss_dict = criterion[optimizer_idx](\n                    scores_disc_real,\n                    scores_disc_fake,\n                )\n            return outputs, loss_dict\n\n        if optimizer_idx == 1:\n            mel = batch[\"mel\"]\n\n            # compute melspec segment\n            with autocast(enabled=False):\n                if self.args.encoder_sample_rate:\n                    spec_segment_size = self.spec_segment_size * int(self.interpolate_factor)\n                else:\n                    spec_segment_size = self.spec_segment_size\n\n                mel_slice = segment(\n                    
mel.float(), self.model_outputs_cache[\"slice_ids\"], spec_segment_size, pad_short=True\n                )\n                mel_slice_hat = wav_to_mel(\n                    y=self.model_outputs_cache[\"model_outputs\"].float(),\n                    n_fft=self.config.audio.fft_size,\n                    sample_rate=self.config.audio.sample_rate,\n                    num_mels=self.config.audio.num_mels,\n                    hop_length=self.config.audio.hop_length,\n                    win_length=self.config.audio.win_length,\n                    fmin=self.config.audio.mel_fmin,\n                    fmax=self.config.audio.mel_fmax,\n                    center=False,\n                )\n\n            # compute discriminator scores and features\n            scores_disc_fake, feats_disc_fake, _, feats_disc_real = self.disc(\n                self.model_outputs_cache[\"model_outputs\"], self.model_outputs_cache[\"waveform_seg\"]\n            )\n\n            # compute losses\n            with autocast(enabled=False):  # use float32 for the criterion\n                loss_dict = criterion[optimizer_idx](\n                    mel_slice_hat=mel_slice.float(),\n                    mel_slice=mel_slice_hat.float(),\n                    z_p=self.model_outputs_cache[\"z_p\"].float(),\n                    logs_q=self.model_outputs_cache[\"logs_q\"].float(),\n                    m_p=self.model_outputs_cache[\"m_p\"].float(),\n                    logs_p=self.model_outputs_cache[\"logs_p\"].float(),\n                    z_len=spec_lens,\n                    scores_disc_fake=scores_disc_fake,\n                    feats_disc_fake=feats_disc_fake,\n                    feats_disc_real=feats_disc_real,\n                    loss_duration=self.model_outputs_cache[\"loss_duration\"],\n                    use_speaker_encoder_as_loss=self.args.use_speaker_encoder_as_loss,\n                    gt_spk_emb=self.model_outputs_cache[\"gt_spk_emb\"],\n                    
syn_spk_emb=self.model_outputs_cache[\"syn_spk_emb\"],\n                )\n\n            return self.model_outputs_cache, loss_dict\n\n        raise ValueError(\" [!] Unexpected `optimizer_idx`.\")\n\n    def _log(self, ap, batch, outputs, name_prefix=\"train\"):  # pylint: disable=unused-argument,no-self-use\n        y_hat = outputs[1][\"model_outputs\"]\n        y = outputs[1][\"waveform_seg\"]\n        figures = plot_results(y_hat, y, ap, name_prefix)\n        sample_voice = y_hat[0].squeeze(0).detach().cpu().numpy()\n        audios = {f\"{name_prefix}/audio\": sample_voice}\n\n        alignments = outputs[1][\"alignments\"]\n        align_img = alignments[0].data.cpu().numpy().T\n\n        figures.update(\n            {\n                \"alignment\": plot_alignment(align_img, output_fig=False),\n            }\n        )\n        return figures, audios\n\n    def train_log(\n        self, batch: dict, outputs: dict, logger: \"Logger\", assets: dict, steps: int\n    ):  # pylint: disable=no-self-use\n        \"\"\"Create visualizations and waveform examples.\n\n        For example, here you can plot spectrograms and generate sample sample waveforms from these spectrograms to\n        be projected onto Tensorboard.\n\n        Args:\n            ap (AudioProcessor): audio processor used at training.\n            batch (Dict): Model inputs used at the previous training step.\n            outputs (Dict): Model outputs generated at the previoud training step.\n\n        Returns:\n            Tuple[Dict, np.ndarray]: training plots and output waveform.\n        \"\"\"\n        figures, audios = self._log(self.ap, batch, outputs, \"train\")\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    @torch.no_grad()\n    def eval_step(self, batch: dict, criterion: nn.Module, optimizer_idx: int):\n        return self.train_step(batch, criterion, optimizer_idx)\n\n    def eval_log(self, batch: dict, outputs: dict, 
logger: \"Logger\", assets: dict, steps: int) -> None:\n        figures, audios = self._log(self.ap, batch, outputs, \"eval\")\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    def get_aux_input_from_test_sentences(self, sentence_info):\n        if hasattr(self.config, \"model_args\"):\n            config = self.config.model_args\n        else:\n            config = self.config\n\n        # extract speaker and language info\n        text, speaker_name, style_wav, language_name = None, None, None, None\n\n        if isinstance(sentence_info, list):\n            if len(sentence_info) == 1:\n                text = sentence_info[0]\n            elif len(sentence_info) == 2:\n                text, speaker_name = sentence_info\n            elif len(sentence_info) == 3:\n                text, speaker_name, style_wav = sentence_info\n            elif len(sentence_info) == 4:\n                text, speaker_name, style_wav, language_name = sentence_info\n        else:\n            text = sentence_info\n\n        # get speaker  id/d_vector\n        speaker_id, d_vector, language_id = None, None, None\n        if hasattr(self, \"speaker_manager\"):\n            if config.use_d_vector_file:\n                if speaker_name is None:\n                    d_vector = self.speaker_manager.get_random_embedding()\n                else:\n                    d_vector = self.speaker_manager.get_mean_embedding(speaker_name, num_samples=None, randomize=False)\n            elif config.use_speaker_embedding:\n                if speaker_name is None:\n                    speaker_id = self.speaker_manager.get_random_id()\n                else:\n                    speaker_id = self.speaker_manager.name_to_id[speaker_name]\n\n        # get language id\n        if hasattr(self, \"language_manager\") and config.use_language_embedding and language_name is not None:\n            language_id = 
self.language_manager.name_to_id[language_name]\n\n        return {\n            \"text\": text,\n            \"speaker_id\": speaker_id,\n            \"style_wav\": style_wav,\n            \"d_vector\": d_vector,\n            \"language_id\": language_id,\n            \"language_name\": language_name,\n        }\n\n    @torch.no_grad()\n    def test_run(self, assets) -> Tuple[Dict, Dict]:\n        \"\"\"Generic test run for `tts` models used by `Trainer`.\n\n        You can override this for a different behaviour.\n\n        Returns:\n            Tuple[Dict, Dict]: Test figures and audios to be projected to Tensorboard.\n        \"\"\"\n        print(\" | > Synthesizing test sentences.\")\n        test_audios = {}\n        test_figures = {}\n        test_sentences = self.config.test_sentences\n        for idx, s_info in enumerate(test_sentences):\n            aux_inputs = self.get_aux_input_from_test_sentences(s_info)\n            wav, alignment, _, _ = synthesis(\n                self,\n                aux_inputs[\"text\"],\n                self.config,\n                \"cuda\" in str(next(self.parameters()).device),\n                speaker_id=aux_inputs[\"speaker_id\"],\n                d_vector=aux_inputs[\"d_vector\"],\n                style_wav=aux_inputs[\"style_wav\"],\n                language_id=aux_inputs[\"language_id\"],\n                use_griffin_lim=True,\n                do_trim_silence=False,\n            ).values()\n            test_audios[\"{}-audio\".format(idx)] = wav\n            test_figures[\"{}-alignment\".format(idx)] = plot_alignment(alignment.T, output_fig=False)\n        return {\"figures\": test_figures, \"audios\": test_audios}\n\n    def test_log(\n        self, outputs: dict, logger: \"Logger\", assets: dict, steps: int  # pylint: disable=unused-argument\n    ) -> None:\n        logger.test_audios(steps, outputs[\"audios\"], self.ap.sample_rate)\n        logger.test_figures(steps, outputs[\"figures\"])\n\n    def 
format_batch(self, batch: Dict) -> Dict:\n        \"\"\"Compute speaker, langugage IDs and d_vector for the batch if necessary.\"\"\"\n        speaker_ids = None\n        language_ids = None\n        d_vectors = None\n\n        # get numerical speaker ids from speaker names\n        if self.speaker_manager is not None and self.speaker_manager.name_to_id and self.args.use_speaker_embedding:\n            speaker_ids = [self.speaker_manager.name_to_id[sn] for sn in batch[\"speaker_names\"]]\n\n        if speaker_ids is not None:\n            speaker_ids = torch.LongTensor(speaker_ids)\n\n        # get d_vectors from audio file names\n        if self.speaker_manager is not None and self.speaker_manager.embeddings and self.args.use_d_vector_file:\n            d_vector_mapping = self.speaker_manager.embeddings\n            d_vectors = [d_vector_mapping[w][\"embedding\"] for w in batch[\"audio_unique_names\"]]\n            d_vectors = torch.FloatTensor(d_vectors)\n\n        # get language ids from language names\n        if self.language_manager is not None and self.language_manager.name_to_id and self.args.use_language_embedding:\n            language_ids = [self.language_manager.name_to_id[ln] for ln in batch[\"language_names\"]]\n\n        if language_ids is not None:\n            language_ids = torch.LongTensor(language_ids)\n\n        batch[\"language_ids\"] = language_ids\n        batch[\"d_vectors\"] = d_vectors\n        batch[\"speaker_ids\"] = speaker_ids\n        return batch\n\n    def format_batch_on_device(self, batch):\n        \"\"\"Compute spectrograms on the device.\"\"\"\n        ac = self.config.audio\n\n        if self.args.encoder_sample_rate:\n            wav = self.audio_resampler(batch[\"waveform\"])\n        else:\n            wav = batch[\"waveform\"]\n\n        # compute spectrograms\n        batch[\"spec\"] = wav_to_spec(wav, ac.fft_size, ac.hop_length, ac.win_length, center=False)\n\n        if self.args.encoder_sample_rate:\n            # 
recompute spec with high sampling rate to the loss\n            spec_mel = wav_to_spec(batch[\"waveform\"], ac.fft_size, ac.hop_length, ac.win_length, center=False)\n            # remove extra stft frames if needed\n            if spec_mel.size(2) > int(batch[\"spec\"].size(2) * self.interpolate_factor):\n                spec_mel = spec_mel[:, :, : int(batch[\"spec\"].size(2) * self.interpolate_factor)]\n            else:\n                batch[\"spec\"] = batch[\"spec\"][:, :, : int(spec_mel.size(2) / self.interpolate_factor)]\n        else:\n            spec_mel = batch[\"spec\"]\n\n        batch[\"mel\"] = spec_to_mel(\n            spec=spec_mel,\n            n_fft=ac.fft_size,\n            num_mels=ac.num_mels,\n            sample_rate=ac.sample_rate,\n            fmin=ac.mel_fmin,\n            fmax=ac.mel_fmax,\n        )\n\n        if self.args.encoder_sample_rate:\n            assert batch[\"spec\"].shape[2] == int(\n                batch[\"mel\"].shape[2] / self.interpolate_factor\n            ), f\"{batch['spec'].shape[2]}, {batch['mel'].shape[2]}\"\n        else:\n            assert batch[\"spec\"].shape[2] == batch[\"mel\"].shape[2], f\"{batch['spec'].shape[2]}, {batch['mel'].shape[2]}\"\n\n        # compute spectrogram frame lengths\n        batch[\"spec_lens\"] = (batch[\"spec\"].shape[2] * batch[\"waveform_rel_lens\"]).int()\n        batch[\"mel_lens\"] = (batch[\"mel\"].shape[2] * batch[\"waveform_rel_lens\"]).int()\n\n        if self.args.encoder_sample_rate:\n            assert (batch[\"spec_lens\"] - (batch[\"mel_lens\"] / self.interpolate_factor).int()).sum() == 0\n        else:\n            assert (batch[\"spec_lens\"] - batch[\"mel_lens\"]).sum() == 0\n\n        # zero the padding frames\n        batch[\"spec\"] = batch[\"spec\"] * sequence_mask(batch[\"spec_lens\"]).unsqueeze(1)\n        batch[\"mel\"] = batch[\"mel\"] * sequence_mask(batch[\"mel_lens\"]).unsqueeze(1)\n        return batch\n\n    def get_sampler(self, config: Coqpit, dataset: 
TTSDataset, num_gpus=1, is_eval=False):\n        weights = None\n        data_items = dataset.samples\n        if getattr(config, \"use_weighted_sampler\", False):\n            for attr_name, alpha in config.weighted_sampler_attrs.items():\n                print(f\" > Using weighted sampler for attribute '{attr_name}' with alpha '{alpha}'\")\n                multi_dict = config.weighted_sampler_multipliers.get(attr_name, None)\n                print(multi_dict)\n                weights, attr_names, attr_weights = get_attribute_balancer_weights(\n                    attr_name=attr_name, items=data_items, multi_dict=multi_dict\n                )\n                weights = weights * alpha\n                print(f\" > Attribute weights for '{attr_names}' \\n | > {attr_weights}\")\n\n        # input_audio_lenghts = [os.path.getsize(x[\"audio_file\"]) for x in data_items]\n\n        if weights is not None:\n            w_sampler = WeightedRandomSampler(weights, len(weights))\n            batch_sampler = BucketBatchSampler(\n                w_sampler,\n                data=data_items,\n                batch_size=config.eval_batch_size if is_eval else config.batch_size,\n                sort_key=lambda x: os.path.getsize(x[\"audio_file\"]),\n                drop_last=True,\n            )\n        else:\n            batch_sampler = None\n        # sampler for DDP\n        if batch_sampler is None:\n            batch_sampler = DistributedSampler(dataset) if num_gpus > 1 else None\n        else:  # If a sampler is already defined use this sampler and DDP sampler together\n            batch_sampler = (\n                DistributedSamplerWrapper(batch_sampler) if num_gpus > 1 else batch_sampler\n            )  # TODO: check batch_sampler with multi-gpu\n        return batch_sampler\n\n    def get_data_loader(\n        self,\n        config: Coqpit,\n        assets: Dict,\n        is_eval: bool,\n        samples: Union[List[Dict], List[List]],\n        verbose: bool,\n        
num_gpus: int,\n        rank: int = None,\n    ) -> \"DataLoader\":\n        if is_eval and not config.run_eval:\n            loader = None\n        else:\n            # init dataloader\n            dataset = VitsDataset(\n                model_args=self.args,\n                samples=samples,\n                batch_group_size=0 if is_eval else config.batch_group_size * config.batch_size,\n                min_text_len=config.min_text_len,\n                max_text_len=config.max_text_len,\n                min_audio_len=config.min_audio_len,\n                max_audio_len=config.max_audio_len,\n                phoneme_cache_path=config.phoneme_cache_path,\n                precompute_num_workers=config.precompute_num_workers,\n                verbose=verbose,\n                tokenizer=self.tokenizer,\n                start_by_longest=config.start_by_longest,\n            )\n\n            # wait all the DDP process to be ready\n            if num_gpus > 1:\n                dist.barrier()\n\n            # sort input sequences from short to long\n            dataset.preprocess_samples()\n\n            # get samplers\n            sampler = self.get_sampler(config, dataset, num_gpus)\n            if sampler is None:\n                loader = DataLoader(\n                    dataset,\n                    batch_size=config.eval_batch_size if is_eval else config.batch_size,\n                    shuffle=False,  # shuffle is done in the dataset.\n                    collate_fn=dataset.collate_fn,\n                    drop_last=False,  # setting this False might cause issues in AMP training.\n                    num_workers=config.num_eval_loader_workers if is_eval else config.num_loader_workers,\n                    pin_memory=False,\n                )\n            else:\n                if num_gpus > 1:\n                    loader = DataLoader(\n                        dataset,\n                        sampler=sampler,\n                        
batch_size=config.eval_batch_size if is_eval else config.batch_size,\n                        collate_fn=dataset.collate_fn,\n                        num_workers=config.num_eval_loader_workers if is_eval else config.num_loader_workers,\n                        pin_memory=False,\n                    )\n                else:\n                    loader = DataLoader(\n                        dataset,\n                        batch_sampler=sampler,\n                        collate_fn=dataset.collate_fn,\n                        num_workers=config.num_eval_loader_workers if is_eval else config.num_loader_workers,\n                        pin_memory=False,\n                    )\n        return loader\n\n    def get_optimizer(self) -> List:\n        \"\"\"Initiate and return the GAN optimizers based on the config parameters.\n        It returns 2 optimizers in a list. The first one is for the discriminator and the second one is for the generator.\n        Returns:\n            List: optimizers.\n        \"\"\"\n        # discriminator optimizer\n        optimizer0 = get_optimizer(self.config.optimizer, self.config.optimizer_params, self.config.lr_disc, self.disc)\n\n        # select generator parameters (everything outside the discriminator)\n        gen_parameters = chain(params for k, params in self.named_parameters() if not k.startswith(\"disc.\"))\n        optimizer1 = get_optimizer(\n            self.config.optimizer, self.config.optimizer_params, self.config.lr_gen, parameters=gen_parameters\n        )\n        return [optimizer0, optimizer1]\n\n    def get_lr(self) -> List:\n        \"\"\"Set the initial learning rates for each optimizer.\n\n        Returns:\n            List: learning rates for each optimizer.\n        \"\"\"\n        return [self.config.lr_disc, self.config.lr_gen]\n\n    def get_scheduler(self, optimizer) -> List:\n        \"\"\"Set the schedulers for each optimizer.\n\n        Args:\n            optimizer (List[`torch.optim.Optimizer`]): List of optimizers.\n\n        Returns:\n            List: Schedulers, one for 
each optimizer.\n        \"\"\"\n        scheduler_D = get_scheduler(self.config.lr_scheduler_disc, self.config.lr_scheduler_disc_params, optimizer[0])\n        scheduler_G = get_scheduler(self.config.lr_scheduler_gen, self.config.lr_scheduler_gen_params, optimizer[1])\n        return [scheduler_D, scheduler_G]\n\n    def get_criterion(self):\n        \"\"\"Get criterions for each optimizer. The index in the output list matches the optimizer idx used in\n        `train_step()`\"\"\"\n        from TTS.tts.layers.losses import (  # pylint: disable=import-outside-toplevel\n            VitsDiscriminatorLoss,\n            VitsGeneratorLoss,\n        )\n\n        return [VitsDiscriminatorLoss(self.config), VitsGeneratorLoss(self.config)]\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, strict=True, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        \"\"\"Load the model checkpoint and setup for training or inference\"\"\"\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        # compat band-aid for the pre-trained models to not use the encoder baked into the model\n        # TODO: consider baking the speaker encoder into the model and call it from there.\n        # as it is probably easier for model distribution.\n        state[\"model\"] = {k: v for k, v in state[\"model\"].items() if \"speaker_encoder\" not in k}\n\n        if self.args.encoder_sample_rate is not None and eval:\n            # audio resampler is not used in inference time\n            self.audio_resampler = None\n\n        # handle fine-tuning from a checkpoint with additional speakers\n        if hasattr(self, \"emb_g\") and state[\"model\"][\"emb_g.weight\"].shape != self.emb_g.weight.shape:\n            num_new_speakers = self.emb_g.weight.shape[0] - state[\"model\"][\"emb_g.weight\"].shape[0]\n            print(f\" > Loading checkpoint with {num_new_speakers} additional speakers.\")\n        
    emb_g = state[\"model\"][\"emb_g.weight\"]\n            new_row = torch.randn(num_new_speakers, emb_g.shape[1])\n            emb_g = torch.cat([emb_g, new_row], axis=0)\n            state[\"model\"][\"emb_g.weight\"] = emb_g\n        # load the model weights\n        self.load_state_dict(state[\"model\"], strict=strict)\n\n        if eval:\n            self.eval()\n            assert not self.training\n\n    @staticmethod\n    def init_from_config(config: \"VitsConfig\", samples: Union[List[List], List[Dict]] = None, verbose=True):\n        \"\"\"Initiate model from config\n\n        Args:\n            config (VitsConfig): Model config.\n            samples (Union[List[List], List[Dict]]): Training samples to parse speaker ids for training.\n                Defaults to None.\n        \"\"\"\n        from TTS.utils.audio import AudioProcessor\n\n        upsample_rate = torch.prod(torch.as_tensor(config.model_args.upsample_rates_decoder)).item()\n\n        if not config.model_args.encoder_sample_rate:\n            assert (\n                upsample_rate == config.audio.hop_length\n            ), f\" [!] Product of upsample rates must be equal to the hop length - {upsample_rate} vs {config.audio.hop_length}\"\n        else:\n            encoder_to_vocoder_upsampling_factor = config.audio.sample_rate / config.model_args.encoder_sample_rate\n            effective_hop_length = config.audio.hop_length * encoder_to_vocoder_upsampling_factor\n            assert (\n                upsample_rate == effective_hop_length\n            ), f\" [!] 
Product of upsample rates must be equal to the hop length - {upsample_rate} vs {effective_hop_length}\"\n\n        ap = AudioProcessor.init_from_config(config, verbose=verbose)\n        tokenizer, new_config = TTSTokenizer.init_from_config(config)\n        speaker_manager = SpeakerManager.init_from_config(config, samples)\n        language_manager = LanguageManager.init_from_config(config)\n\n        if config.model_args.speaker_encoder_model_path:\n            speaker_manager.init_encoder(\n                config.model_args.speaker_encoder_model_path, config.model_args.speaker_encoder_config_path\n            )\n        return Vits(new_config, ap, tokenizer, speaker_manager, language_manager)\n\n\n##################################\n# VITS CHARACTERS\n##################################\n\n\nclass VitsCharacters(BaseCharacters):\n    \"\"\"Characters class for VITs model for compatibility with pre-trained models\"\"\"\n\n    def __init__(\n        self,\n        graphemes: str = _characters,\n        punctuations: str = _punctuations,\n        pad: str = _pad,\n        ipa_characters: str = _phonemes,\n    ) -> None:\n        if ipa_characters is not None:\n            graphemes += ipa_characters\n        super().__init__(graphemes, punctuations, pad, None, None, \"<BLNK>\", is_unique=False, is_sorted=True)\n\n    def _create_vocab(self):\n        self._vocab = [self._pad] + list(self._punctuations) + list(self._characters) + [self._blank]\n        self._char_to_id = {char: idx for idx, char in enumerate(self.vocab)}\n        # pylint: disable=unnecessary-comprehension\n        self._id_to_char = {idx: char for idx, char in enumerate(self.vocab)}\n\n    @staticmethod\n    def init_from_config(config: Coqpit):\n        if config.characters is not None:\n            _pad = config.characters[\"pad\"]\n            _punctuations = config.characters[\"punctuations\"]\n            _letters = config.characters[\"characters\"]\n            _letters_ipa = 
config.characters[\"phonemes\"]\n            return (\n                VitsCharacters(graphemes=_letters, ipa_characters=_letters_ipa, punctuations=_punctuations, pad=_pad),\n                config,\n            )\n        characters = VitsCharacters()\n        new_config = replace(config, characters=characters.to_config())\n        return characters, new_config\n\n    def to_config(self) -> \"CharactersConfig\":\n        return CharactersConfig(\n            characters=self._characters,\n            punctuations=self._punctuations,\n            pad=self._pad,\n            eos=None,\n            bos=None,\n            blank=self._blank,\n            is_unique=False,\n            is_sorted=True,\n        )\n"
  },
  {
    "path": "TTS/tts/utils/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/utils/data.py",
    "content": "import bisect\n\nimport numpy as np\nimport torch\n\n\ndef _pad_data(x, length):\n    _pad = 0\n    assert x.ndim == 1\n    return np.pad(x, (0, length - x.shape[0]), mode=\"constant\", constant_values=_pad)\n\n\ndef prepare_data(inputs):\n    max_len = max((len(x) for x in inputs))\n    return np.stack([_pad_data(x, max_len) for x in inputs])\n\n\ndef _pad_tensor(x, length):\n    _pad = 0.0\n    assert x.ndim == 2\n    x = np.pad(x, [[0, 0], [0, length - x.shape[1]]], mode=\"constant\", constant_values=_pad)\n    return x\n\n\ndef prepare_tensor(inputs, out_steps):\n    max_len = max((x.shape[1] for x in inputs))\n    remainder = max_len % out_steps\n    pad_len = max_len + (out_steps - remainder) if remainder > 0 else max_len\n    return np.stack([_pad_tensor(x, pad_len) for x in inputs])\n\n\ndef _pad_stop_target(x: np.ndarray, length: int, pad_val=1) -> np.ndarray:\n    \"\"\"Pad stop target array.\n\n    Args:\n        x (np.ndarray): Stop target array.\n        length (int): Length after padding.\n        pad_val (int, optional): Padding value. 
Defaults to 1.\n\n    Returns:\n        np.ndarray: Padded stop target array.\n    \"\"\"\n    assert x.ndim == 1\n    return np.pad(x, (0, length - x.shape[0]), mode=\"constant\", constant_values=pad_val)\n\n\ndef prepare_stop_target(inputs, out_steps):\n    \"\"\"Pad row vectors with 1.\"\"\"\n    max_len = max((x.shape[0] for x in inputs))\n    remainder = max_len % out_steps\n    pad_len = max_len + (out_steps - remainder) if remainder > 0 else max_len\n    return np.stack([_pad_stop_target(x, pad_len) for x in inputs])\n\n\ndef pad_per_step(inputs, pad_len):\n    return np.pad(inputs, [[0, 0], [0, 0], [0, pad_len]], mode=\"constant\", constant_values=0.0)\n\n\ndef get_length_balancer_weights(items: list, num_buckets=10):\n    # get all durations\n    audio_lengths = np.array([item[\"audio_length\"] for item in items])\n    # create `num_buckets` bucket classes based on the dataset's min and max lengths\n    max_length = int(max(audio_lengths))\n    min_length = int(min(audio_lengths))\n    step = int((max_length - min_length) / num_buckets) + 1\n    buckets_classes = [i + step for i in range(min_length, (max_length - step) + num_buckets + 1, step)]\n    # assign each sample to its length bucket\n    buckets_names = np.array(\n        [buckets_classes[bisect.bisect_left(buckets_classes, item[\"audio_length\"])] for item in items]\n    )\n    # count the samples per bucket and compute the weight of each sample\n    unique_buckets_names = np.unique(buckets_names).tolist()\n    bucket_ids = [unique_buckets_names.index(l) for l in buckets_names]\n    bucket_count = np.array([len(np.where(buckets_names == l)[0]) for l in unique_buckets_names])\n    weight_bucket = 1.0 / bucket_count\n    dataset_samples_weight = np.array([weight_bucket[l] for l in bucket_ids])\n    # normalize\n    dataset_samples_weight = dataset_samples_weight / np.linalg.norm(dataset_samples_weight)\n    return torch.from_numpy(dataset_samples_weight).float()\n"
  },
  {
    "path": "TTS/tts/utils/helpers.py",
    "content": "import numpy as np\nimport torch\nfrom torch.nn import functional as F\n\ntry:\n    from TTS.tts.utils.monotonic_align.core import maximum_path_c\n\n    CYTHON = True\nexcept ModuleNotFoundError:\n    CYTHON = False\n\n\nclass StandardScaler:\n    \"\"\"StandardScaler for mean-scale normalization with the given mean and scale values.\"\"\"\n\n    def __init__(self, mean: np.ndarray = None, scale: np.ndarray = None) -> None:\n        self.mean_ = mean\n        self.scale_ = scale\n\n    def set_stats(self, mean, scale):\n        self.mean_ = mean\n        self.scale_ = scale\n\n    def reset_stats(self):\n        delattr(self, \"mean_\")\n        delattr(self, \"scale_\")\n\n    def transform(self, X):\n        X = np.asarray(X)\n        X -= self.mean_\n        X /= self.scale_\n        return X\n\n    def inverse_transform(self, X):\n        X = np.asarray(X)\n        X *= self.scale_\n        X += self.mean_\n        return X\n\n\n# from https://gist.github.com/jihunchoi/f1434a77df9db1bb337417854b398df1\ndef sequence_mask(sequence_length, max_len=None):\n    \"\"\"Create a sequence mask for filtering padding in a sequence tensor.\n\n    Args:\n        sequence_length (torch.tensor): Sequence lengths.\n        max_len (int, Optional): Maximum sequence length. 
Defaults to None.\n\n    Shapes:\n        - mask: :math:`[B, T_max]`\n    \"\"\"\n    if max_len is None:\n        max_len = sequence_length.data.max()\n    seq_range = torch.arange(max_len, dtype=sequence_length.dtype, device=sequence_length.device)\n    # B x T_max\n    mask = seq_range.unsqueeze(0) < sequence_length.unsqueeze(1)\n    return mask\n\n\ndef segment(x: torch.tensor, segment_indices: torch.tensor, segment_size=4, pad_short=False):\n    \"\"\"Segment each sample in a batch based on the provided segment indices\n\n    Args:\n        x (torch.tensor): Input tensor.\n        segment_indices (torch.tensor): Segment indices.\n        segment_size (int): Expected output segment size.\n        pad_short (bool): Pad the end of input tensor with zeros if shorter than the segment size.\n    \"\"\"\n    # pad the input tensor if it is shorter than the segment size\n    if pad_short and x.shape[-1] < segment_size:\n        x = torch.nn.functional.pad(x, (0, segment_size - x.size(2)))\n\n    segments = torch.zeros_like(x[:, :, :segment_size])\n\n    for i in range(x.size(0)):\n        index_start = segment_indices[i]\n        index_end = index_start + segment_size\n        x_i = x[i]\n        if pad_short and index_end >= x.size(2):\n            # pad the sample if it is shorter than the segment size\n            x_i = torch.nn.functional.pad(x_i, (0, (index_end + 1) - x.size(2)))\n        segments[i] = x_i[:, index_start:index_end]\n    return segments\n\n\ndef rand_segments(\n    x: torch.tensor, x_lengths: torch.tensor = None, segment_size=4, let_short_samples=False, pad_short=False\n):\n    \"\"\"Create random segments based on the input lengths.\n\n    Args:\n        x (torch.tensor): Input tensor.\n        x_lengths (torch.tensor): Input lengths.\n        segment_size (int): Expected output segment size.\n        let_short_samples (bool): Allow shorter samples than the segment size.\n        pad_short (bool): Pad the end of input tensor with zeros if shorter 
than the segment size.\n\n    Shapes:\n        - x: :math:`[B, C, T]`\n        - x_lengths: :math:`[B]`\n    \"\"\"\n    B, _, T = x.size()\n    if pad_short:\n        if T < segment_size:\n            x = torch.nn.functional.pad(x, (0, segment_size - T))\n            T = segment_size\n    # fall back to the full (padded) length when no lengths are given\n    if x_lengths is None:\n        _x_lenghts = torch.full((B,), T, dtype=torch.long, device=x.device)\n    else:\n        _x_lenghts = x_lengths.clone()\n    len_diff = _x_lenghts - segment_size\n    if let_short_samples:\n        _x_lenghts[len_diff < 0] = segment_size\n        len_diff = _x_lenghts - segment_size\n    else:\n        assert all(\n            len_diff > 0\n        ), f\" [!] At least one sample is shorter than the segment size ({segment_size}). \\n {_x_lenghts}\"\n    segment_indices = (torch.rand([B]).type_as(x) * (len_diff + 1)).long()\n    ret = segment(x, segment_indices, segment_size, pad_short=pad_short)\n    return ret, segment_indices\n\n\ndef average_over_durations(values, durs):\n    \"\"\"Average values over durations.\n\n    Shapes:\n        - values: :math:`[B, 1, T_de]`\n        - durs: :math:`[B, T_en]`\n        - avg: :math:`[B, 1, T_en]`\n    \"\"\"\n    durs_cums_ends = torch.cumsum(durs, dim=1).long()\n    durs_cums_starts = torch.nn.functional.pad(durs_cums_ends[:, :-1], (1, 0))\n    values_nonzero_cums = torch.nn.functional.pad(torch.cumsum(values != 0.0, dim=2), (1, 0))\n    values_cums = torch.nn.functional.pad(torch.cumsum(values, dim=2), (1, 0))\n\n    bs, l = durs_cums_ends.size()\n    n_formants = values.size(1)\n    dcs = durs_cums_starts[:, None, :].expand(bs, n_formants, l)\n    dce = durs_cums_ends[:, None, :].expand(bs, n_formants, l)\n\n    values_sums = (torch.gather(values_cums, 2, dce) - torch.gather(values_cums, 2, dcs)).float()\n    values_nelems = (torch.gather(values_nonzero_cums, 2, dce) - torch.gather(values_nonzero_cums, 2, dcs)).float()\n\n    avg = torch.where(values_nelems == 0.0, values_nelems, values_sums / values_nelems)\n    return avg\n\n\ndef convert_pad_shape(pad_shape):\n    l = 
pad_shape[::-1]\n    pad_shape = [item for sublist in l for item in sublist]\n    return pad_shape\n\n\ndef generate_path(duration, mask):\n    \"\"\"\n    Shapes:\n        - duration: :math:`[B, T_en]`\n        - mask: :math:`[B, T_en, T_de]`\n        - path: :math:`[B, T_en, T_de]`\n    \"\"\"\n    device = duration.device\n    b, t_x, t_y = mask.shape\n    cum_duration = torch.cumsum(duration, 1)\n    path = torch.zeros(b, t_x, t_y, dtype=mask.dtype).to(device=device)\n\n    cum_duration_flat = cum_duration.view(b * t_x)\n    path = sequence_mask(cum_duration_flat, t_y).to(mask.dtype)\n    path = path.view(b, t_x, t_y)\n    path = path - F.pad(path, convert_pad_shape([[0, 0], [1, 0], [0, 0]]))[:, :-1]\n    path = path * mask\n    return path\n\n\ndef maximum_path(value, mask):\n    if CYTHON:\n        return maximum_path_cython(value, mask)\n    return maximum_path_numpy(value, mask)\n\n\ndef maximum_path_cython(value, mask):\n    \"\"\"Cython optimised version.\n    Shapes:\n        - value: :math:`[B, T_en, T_de]`\n        - mask: :math:`[B, T_en, T_de]`\n    \"\"\"\n    value = value * mask\n    device = value.device\n    dtype = value.dtype\n    value = value.data.cpu().numpy().astype(np.float32)\n    path = np.zeros_like(value).astype(np.int32)\n    mask = mask.data.cpu().numpy()\n\n    t_x_max = mask.sum(1)[:, 0].astype(np.int32)\n    t_y_max = mask.sum(2)[:, 0].astype(np.int32)\n    maximum_path_c(path, value, t_x_max, t_y_max)\n    return torch.from_numpy(path).to(device=device, dtype=dtype)\n\n\ndef maximum_path_numpy(value, mask, max_neg_val=None):\n    \"\"\"\n    Monotonic alignment search algorithm\n    Numpy-friendly version. 
It's about 4 times faster than torch version.\n    value: [b, t_x, t_y]\n    mask: [b, t_x, t_y]\n    \"\"\"\n    if max_neg_val is None:\n        max_neg_val = -np.inf  # Patch for Sphinx complaint\n    value = value * mask\n\n    device = value.device\n    dtype = value.dtype\n    value = value.cpu().detach().numpy()\n    # `np.bool` was removed in NumPy 1.24; the builtin `bool` is the supported spelling.\n    mask = mask.cpu().detach().numpy().astype(bool)\n\n    b, t_x, t_y = value.shape\n    direction = np.zeros(value.shape, dtype=np.int64)\n    v = np.zeros((b, t_x), dtype=np.float32)\n    x_range = np.arange(t_x, dtype=np.float32).reshape(1, -1)\n    for j in range(t_y):\n        v0 = np.pad(v, [[0, 0], [1, 0]], mode=\"constant\", constant_values=max_neg_val)[:, :-1]\n        v1 = v\n        max_mask = v1 >= v0\n        v_max = np.where(max_mask, v1, v0)\n        direction[:, :, j] = max_mask\n\n        index_mask = x_range <= j\n        v = np.where(index_mask, v_max + value[:, :, j], max_neg_val)\n    direction = np.where(mask, direction, 1)\n\n    path = np.zeros(value.shape, dtype=np.float32)\n    index = mask[:, :, 0].sum(1).astype(np.int64) - 1\n    index_range = np.arange(b)\n    for j in reversed(range(t_y)):\n        path[index_range, index, j] = 1\n        index = index + direction[index_range, index, j] - 1\n    path = path * mask.astype(np.float32)\n    path = torch.from_numpy(path).to(device=device, dtype=dtype)\n    return path\n"
  },
  {
    "path": "TTS/tts/utils/languages.py",
    "content": "import os\nfrom typing import Any, Dict, List\n\nimport fsspec\nimport numpy as np\nimport torch\nfrom coqpit import Coqpit\n\nfrom TTS.config import check_config_and_model_args\nfrom TTS.tts.utils.managers import BaseIDManager\n\n\nclass LanguageManager(BaseIDManager):\n    \"\"\"Manage the languages for multi-lingual 🐸TTS models. Load a datafile and parse the information\n    in a way that can be queried by language.\n\n    Args:\n        language_ids_file_path (str, optional): Path to the metafile that maps language names to ids used by\n        TTS models. Defaults to \"\".\n        config (Coqpit, optional): Coqpit config that contains the language information in the datasets filed.\n        Defaults to None.\n\n    Examples:\n        >>> manager = LanguageManager(language_ids_file_path=language_ids_file_path)\n        >>> language_id_mapper = manager.language_ids\n    \"\"\"\n\n    def __init__(\n        self,\n        language_ids_file_path: str = \"\",\n        config: Coqpit = None,\n    ):\n        super().__init__(id_file_path=language_ids_file_path)\n\n        if config:\n            self.set_language_ids_from_config(config)\n\n    @property\n    def num_languages(self) -> int:\n        return len(list(self.name_to_id.keys()))\n\n    @property\n    def language_names(self) -> List:\n        return list(self.name_to_id.keys())\n\n    @staticmethod\n    def parse_language_ids_from_config(c: Coqpit) -> Dict:\n        \"\"\"Set language id from config.\n\n        Args:\n            c (Coqpit): Config\n\n        Returns:\n            Tuple[Dict, int]: Language ID mapping and the number of languages.\n        \"\"\"\n        languages = set({})\n        for dataset in c.datasets:\n            if \"language\" in dataset:\n                languages.add(dataset[\"language\"])\n            else:\n                raise ValueError(f\"Dataset {dataset['name']} has no language specified.\")\n        return {name: i for i, name in 
enumerate(sorted(list(languages)))}\n\n    def set_language_ids_from_config(self, c: Coqpit) -> None:\n        \"\"\"Set language IDs from the datasets listed in the config.\n\n        Args:\n            c (Coqpit): Config.\n        \"\"\"\n        self.name_to_id = self.parse_language_ids_from_config(c)\n\n    @staticmethod\n    def parse_ids_from_data(items: List, parse_key: str) -> Any:\n        raise NotImplementedError\n\n    def set_ids_from_data(self, items: List, parse_key: str) -> Any:\n        raise NotImplementedError\n\n    def save_ids_to_file(self, file_path: str) -> None:\n        \"\"\"Save language IDs to a json file.\n\n        Args:\n            file_path (str): Path to the output file.\n        \"\"\"\n        self._save_json(file_path, self.name_to_id)\n\n    @staticmethod\n    def init_from_config(config: Coqpit) -> \"LanguageManager\":\n        \"\"\"Initialize the language manager from a Coqpit config.\n\n        Args:\n            config (Coqpit): Coqpit config.\n        \"\"\"\n        language_manager = None\n        if check_config_and_model_args(config, \"use_language_embedding\", True):\n            if config.get(\"language_ids_file\", None):\n                language_manager = LanguageManager(language_ids_file_path=config.language_ids_file)\n            else:\n                # Fall back to the config only when no ID file is given, so a loaded file is not overwritten.\n                language_manager = LanguageManager(config=config)\n        return language_manager\n\n\ndef _set_file_path(path):\n    \"\"\"Find language_ids.json in the given path or in its parent directory.\n    Intended as a band-aid for the different paths returned in restored and continued training.\"\"\"\n    path_restore = os.path.join(os.path.dirname(path), \"language_ids.json\")\n    path_continue = os.path.join(path, \"language_ids.json\")\n    fs = fsspec.get_mapper(path).fs\n    if fs.exists(path_restore):\n        return path_restore\n    if fs.exists(path_continue):\n        return path_continue\n    return None\n\n\ndef get_language_balancer_weights(items: list):\n    language_names = np.array([item[\"language\"] 
for item in items])\n    unique_language_names = np.unique(language_names).tolist()\n    language_ids = [unique_language_names.index(l) for l in language_names]\n    language_count = np.array([len(np.where(language_names == l)[0]) for l in unique_language_names])\n    weight_language = 1.0 / language_count\n    # get weight for each sample\n    dataset_samples_weight = np.array([weight_language[l] for l in language_ids])\n    # normalize\n    dataset_samples_weight = dataset_samples_weight / np.linalg.norm(dataset_samples_weight)\n    return torch.from_numpy(dataset_samples_weight).float()\n"
  },
  {
    "path": "TTS/tts/utils/managers.py",
    "content": "import json\nimport random\nfrom typing import Any, Dict, List, Tuple, Union\n\nimport fsspec\nimport numpy as np\nimport torch\n\nfrom TTS.config import load_config\nfrom TTS.encoder.utils.generic_utils import setup_encoder_model\nfrom TTS.utils.audio import AudioProcessor\n\n\ndef load_file(path: str):\n    if path.endswith(\".json\"):\n        with fsspec.open(path, \"r\") as f:\n            return json.load(f)\n    elif path.endswith(\".pth\"):\n        with fsspec.open(path, \"rb\") as f:\n            return torch.load(f, map_location=\"cpu\")\n    else:\n        raise ValueError(\"Unsupported file type\")\n\n\ndef save_file(obj: Any, path: str):\n    if path.endswith(\".json\"):\n        with fsspec.open(path, \"w\") as f:\n            json.dump(obj, f, indent=4)\n    elif path.endswith(\".pth\"):\n        with fsspec.open(path, \"wb\") as f:\n            torch.save(obj, f)\n    else:\n        raise ValueError(\"Unsupported file type\")\n\n\nclass BaseIDManager:\n    \"\"\"Base `ID` Manager class. 
Every new `ID` manager must inherit this.\n    It defines common `ID` manager specific functions.\n    \"\"\"\n\n    def __init__(self, id_file_path: str = \"\"):\n        self.name_to_id = {}\n\n        if id_file_path:\n            self.load_ids_from_file(id_file_path)\n\n    @staticmethod\n    def _load_json(json_file_path: str) -> Dict:\n        with fsspec.open(json_file_path, \"r\") as f:\n            return json.load(f)\n\n    @staticmethod\n    def _save_json(json_file_path: str, data: dict) -> None:\n        with fsspec.open(json_file_path, \"w\") as f:\n            json.dump(data, f, indent=4)\n\n    def set_ids_from_data(self, items: List, parse_key: str) -> None:\n        \"\"\"Set IDs from data samples.\n\n        Args:\n            items (List): Data samples returned by `load_tts_samples()`.\n            parse_key (str): The key used to parse the data.\n        \"\"\"\n        self.name_to_id = self.parse_ids_from_data(items, parse_key=parse_key)\n\n    def load_ids_from_file(self, file_path: str) -> None:\n        \"\"\"Set IDs from a file.\n\n        Args:\n            file_path (str): Path to the file.\n        \"\"\"\n        self.name_to_id = load_file(file_path)\n\n    def save_ids_to_file(self, file_path: str) -> None:\n        \"\"\"Save IDs to a json file.\n\n        Args:\n            file_path (str): Path to the output file.\n        \"\"\"\n        save_file(self.name_to_id, file_path)\n\n    def get_random_id(self) -> Any:\n        \"\"\"Get a random ID.\n\n        Returns:\n            Any: A randomly chosen ID, or None if no IDs are loaded.\n        \"\"\"\n        if self.name_to_id:\n            return self.name_to_id[random.choices(list(self.name_to_id.keys()))[0]]\n\n        return None\n\n    @staticmethod\n    def parse_ids_from_data(items: List, parse_key: str) -> Tuple[Dict]:\n        \"\"\"Parse IDs from data samples returned by `load_tts_samples()`.\n\n        Args:\n            items (list): Data samples returned by `load_tts_samples()`.\n            parse_key (str): The key used to parse 
the data.\n        Returns:\n            Dict: Mapping from each name to its integer ID.\n        \"\"\"\n        classes = sorted({item[parse_key] for item in items})\n        ids = {name: i for i, name in enumerate(classes)}\n        return ids\n\n\nclass EmbeddingManager(BaseIDManager):\n    \"\"\"Base `Embedding` Manager class. Every new `Embedding` manager must inherit this.\n    It defines common `Embedding` manager specific functions.\n\n    It expects embedding files in the following format:\n\n    ::\n\n        {\n            'audio_file_key':{\n                'name': 'category_name',\n                'embedding': [<embedding_values>]\n            },\n            ...\n        }\n\n    `audio_file_key` is a unique key to the audio file in the dataset. It can be the path to the file or any other unique key.\n    `embedding` is the embedding vector of the audio file.\n    `name` can be the name of the speaker of the audio file.\n    \"\"\"\n\n    def __init__(\n        self,\n        embedding_file_path: Union[str, List[str]] = \"\",\n        id_file_path: str = \"\",\n        encoder_model_path: str = \"\",\n        encoder_config_path: str = \"\",\n        use_cuda: bool = False,\n    ):\n        super().__init__(id_file_path=id_file_path)\n\n        self.embeddings = {}\n        self.embeddings_by_names = {}\n        self.clip_ids = []\n        self.encoder = None\n        self.encoder_ap = None\n        self.use_cuda = use_cuda\n\n        if embedding_file_path:\n            if isinstance(embedding_file_path, list):\n                self.load_embeddings_from_list_of_files(embedding_file_path)\n            else:\n                self.load_embeddings_from_file(embedding_file_path)\n\n        if encoder_model_path and encoder_config_path:\n            self.init_encoder(encoder_model_path, encoder_config_path, use_cuda)\n\n    @property\n    def num_embeddings(self):\n        \"\"\"Get number of embeddings.\"\"\"\n        return len(self.embeddings)\n\n    @property\n    def 
num_names(self):\n        \"\"\"Get number of names (e.g. speakers) that have embeddings.\"\"\"\n        return len(self.embeddings_by_names)\n\n    @property\n    def embedding_dim(self):\n        \"\"\"Dimensionality of embeddings. If embeddings are not loaded, returns zero.\"\"\"\n        if self.embeddings:\n            return len(self.embeddings[list(self.embeddings.keys())[0]][\"embedding\"])\n        return 0\n\n    @property\n    def embedding_names(self):\n        \"\"\"Get embedding names.\"\"\"\n        return list(self.embeddings_by_names.keys())\n\n    def save_embeddings_to_file(self, file_path: str) -> None:\n        \"\"\"Save embeddings to a json or pth file.\n\n        Args:\n            file_path (str): Path to the output file.\n        \"\"\"\n        save_file(self.embeddings, file_path)\n\n    @staticmethod\n    def read_embeddings_from_file(file_path: str):\n        \"\"\"Load embeddings from a json or pth file.\n\n        Args:\n            file_path (str): Path to the file.\n        \"\"\"\n        embeddings = load_file(file_path)\n        speakers = sorted({x[\"name\"] for x in embeddings.values()})\n        name_to_id = {name: i for i, name in enumerate(speakers)}\n        # dict keys are already unique; sort them so the clip order is deterministic\n        clip_ids = sorted(embeddings.keys())\n        # cache embeddings_by_names for fast inference using a bigger speakers.json\n        embeddings_by_names = {}\n        for x in embeddings.values():\n            if x[\"name\"] not in embeddings_by_names.keys():\n                embeddings_by_names[x[\"name\"]] = [x[\"embedding\"]]\n            else:\n                embeddings_by_names[x[\"name\"]].append(x[\"embedding\"])\n        return name_to_id, clip_ids, embeddings, embeddings_by_names\n\n    def load_embeddings_from_file(self, file_path: str) -> None:\n        \"\"\"Load embeddings from a json or pth file.\n\n        Args:\n            file_path (str): Path to the target file.\n        \"\"\"\n        self.name_to_id, self.clip_ids, self.embeddings, self.embeddings_by_names = 
self.read_embeddings_from_file(\n            file_path\n        )\n\n    def load_embeddings_from_list_of_files(self, file_paths: List[str]) -> None:\n        \"\"\"Load embeddings from a list of json files and don't allow duplicate keys.\n\n        Args:\n            file_paths (List[str]): List of paths to the target json files.\n        \"\"\"\n        self.name_to_id = {}\n        self.clip_ids = []\n        self.embeddings_by_names = {}\n        self.embeddings = {}\n        for file_path in file_paths:\n            ids, clip_ids, embeddings, embeddings_by_names = self.read_embeddings_from_file(file_path)\n            # check colliding keys\n            duplicates = set(self.embeddings.keys()) & set(embeddings.keys())\n            if duplicates:\n                raise ValueError(f\" [!] Duplicate embedding names <{duplicates}> in {file_path}\")\n            # store values\n            self.name_to_id.update(ids)\n            self.clip_ids.extend(clip_ids)\n            self.embeddings_by_names.update(embeddings_by_names)\n            self.embeddings.update(embeddings)\n\n        # reset name_to_id to get the right speaker ids\n        self.name_to_id = {name: i for i, name in enumerate(self.name_to_id)}\n\n    def get_embedding_by_clip(self, clip_idx: str) -> List:\n        \"\"\"Get embedding by clip ID.\n\n        Args:\n            clip_idx (str): Target clip ID.\n\n        Returns:\n            List: embedding as a list.\n        \"\"\"\n        return self.embeddings[clip_idx][\"embedding\"]\n\n    def get_embeddings_by_name(self, idx: str) -> List[List]:\n        \"\"\"Get all embeddings of a speaker.\n\n        Args:\n            idx (str): Target name.\n\n        Returns:\n            List[List]: all the embeddings of the given speaker.\n        \"\"\"\n        return self.embeddings_by_names[idx]\n\n    def get_embeddings_by_names(self) -> Dict:\n        \"\"\"Get all embeddings by names.\n\n        Returns:\n            Dict: all the embeddings of 
each speaker.\n        \"\"\"\n        embeddings_by_names = {}\n        for x in self.embeddings.values():\n            if x[\"name\"] not in embeddings_by_names.keys():\n                embeddings_by_names[x[\"name\"]] = [x[\"embedding\"]]\n            else:\n                embeddings_by_names[x[\"name\"]].append(x[\"embedding\"])\n        return embeddings_by_names\n\n    def get_mean_embedding(self, idx: str, num_samples: int = None, randomize: bool = False) -> np.ndarray:\n        \"\"\"Get mean embedding of a idx.\n\n        Args:\n            idx (str): Target name.\n            num_samples (int, optional): Number of samples to be averaged. Defaults to None.\n            randomize (bool, optional): Pick random `num_samples` of embeddings. Defaults to False.\n\n        Returns:\n            np.ndarray: Mean embedding.\n        \"\"\"\n        embeddings = self.get_embeddings_by_name(idx)\n        if num_samples is None:\n            embeddings = np.stack(embeddings).mean(0)\n        else:\n            assert len(embeddings) >= num_samples, f\" [!] 
{idx} has number of samples < {num_samples}\"\n            if randomize:\n                embeddings = np.stack(random.choices(embeddings, k=num_samples)).mean(0)\n            else:\n                embeddings = np.stack(embeddings[:num_samples]).mean(0)\n        return embeddings\n\n    def get_random_embedding(self) -> Any:\n        \"\"\"Get a random embedding.\n\n        Args:\n\n        Returns:\n            np.ndarray: embedding.\n        \"\"\"\n        if self.embeddings:\n            return self.embeddings[random.choices(list(self.embeddings.keys()))[0]][\"embedding\"]\n\n        return None\n\n    def get_clips(self) -> List:\n        return sorted(self.embeddings.keys())\n\n    def init_encoder(self, model_path: str, config_path: str, use_cuda=False) -> None:\n        \"\"\"Initialize a speaker encoder model.\n\n        Args:\n            model_path (str): Model file path.\n            config_path (str): Model config file path.\n            use_cuda (bool, optional): Use CUDA. 
Defaults to False.\n        \"\"\"\n        self.use_cuda = use_cuda\n        self.encoder_config = load_config(config_path)\n        self.encoder = setup_encoder_model(self.encoder_config)\n        self.encoder_criterion = self.encoder.load_checkpoint(\n            self.encoder_config, model_path, eval=True, use_cuda=use_cuda, cache=True\n        )\n        self.encoder_ap = AudioProcessor(**self.encoder_config.audio)\n\n    def compute_embedding_from_clip(self, wav_file: Union[str, List[str]]) -> list:\n        \"\"\"Compute a embedding from a given audio file.\n\n        Args:\n            wav_file (Union[str, List[str]]): Target file path.\n\n        Returns:\n            list: Computed embedding.\n        \"\"\"\n\n        def _compute(wav_file: str):\n            waveform = self.encoder_ap.load_wav(wav_file, sr=self.encoder_ap.sample_rate)\n            if not self.encoder_config.model_params.get(\"use_torch_spec\", False):\n                m_input = self.encoder_ap.melspectrogram(waveform)\n                m_input = torch.from_numpy(m_input)\n            else:\n                m_input = torch.from_numpy(waveform)\n\n            if self.use_cuda:\n                m_input = m_input.cuda()\n            m_input = m_input.unsqueeze(0)\n            embedding = self.encoder.compute_embedding(m_input)\n            return embedding\n\n        if isinstance(wav_file, list):\n            # compute the mean embedding\n            embeddings = None\n            for wf in wav_file:\n                embedding = _compute(wf)\n                if embeddings is None:\n                    embeddings = embedding\n                else:\n                    embeddings += embedding\n            return (embeddings / len(wav_file))[0].tolist()\n        embedding = _compute(wav_file)\n        return embedding[0].tolist()\n\n    def compute_embeddings(self, feats: Union[torch.Tensor, np.ndarray]) -> List:\n        \"\"\"Compute embedding from features.\n\n        Args:\n            
feats (Union[torch.Tensor, np.ndarray]): Input features.\n\n        Returns:\n            List: computed embedding.\n        \"\"\"\n        if isinstance(feats, np.ndarray):\n            feats = torch.from_numpy(feats)\n        if feats.ndim == 2:\n            feats = feats.unsqueeze(0)\n        if self.use_cuda:\n            feats = feats.cuda()\n        return self.encoder.compute_embedding(feats)\n"
  },
  {
    "path": "TTS/tts/utils/measures.py",
    "content": "def alignment_diagonal_score(alignments, binary=False):\n    \"\"\"\n    Compute how diagonal alignment predictions are. It is useful\n    to measure the alignment consistency of a model\n    Args:\n        alignments (torch.Tensor): batch of alignments.\n        binary (bool): if True, ignore scores and consider attention\n        as a binary mask.\n    Shape:\n        - alignments : :math:`[B, T_de, T_en]`\n    \"\"\"\n    maxs = alignments.max(dim=1)[0]\n    if binary:\n        maxs[maxs > 0] = 1\n    return maxs.mean(dim=1).mean(dim=0).item()\n"
  },
  {
    "path": "TTS/tts/utils/monotonic_align/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/utils/monotonic_align/core.c",
    "content": "/* Generated by Cython 0.29.28 */\n\n/* BEGIN: Cython Metadata\n{\n    \"distutils\": {\n        \"depends\": [],\n        \"name\": \"TTS.tts.utils.monotonic_align.core\",\n        \"sources\": [\n            \"TTS/tts/utils/monotonic_align/core.pyx\"\n        ]\n    },\n    \"module_name\": \"TTS.tts.utils.monotonic_align.core\"\n}\nEND: Cython Metadata */\n\n#ifndef PY_SSIZE_T_CLEAN\n#define PY_SSIZE_T_CLEAN\n#endif /* PY_SSIZE_T_CLEAN */\n#include \"Python.h\"\n#ifndef Py_PYTHON_H\n    #error Python headers needed to compile C extensions, please install development version of Python.\n#elif PY_VERSION_HEX < 0x02060000 || (0x03000000 <= PY_VERSION_HEX && PY_VERSION_HEX < 0x03030000)\n    #error Cython requires Python 2.6+ or Python 3.3+.\n#else\n#define CYTHON_ABI \"0_29_28\"\n#define CYTHON_HEX_VERSION 0x001D1CF0\n#define CYTHON_FUTURE_DIVISION 1\n#include <stddef.h>\n#ifndef offsetof\n  #define offsetof(type, member) ( (size_t) & ((type*)0) -> member )\n#endif\n#if !defined(WIN32) && !defined(MS_WINDOWS)\n  #ifndef __stdcall\n    #define __stdcall\n  #endif\n  #ifndef __cdecl\n    #define __cdecl\n  #endif\n  #ifndef __fastcall\n    #define __fastcall\n  #endif\n#endif\n#ifndef DL_IMPORT\n  #define DL_IMPORT(t) t\n#endif\n#ifndef DL_EXPORT\n  #define DL_EXPORT(t) t\n#endif\n#define __PYX_COMMA ,\n#ifndef HAVE_LONG_LONG\n  #if PY_VERSION_HEX >= 0x02070000\n    #define HAVE_LONG_LONG\n  #endif\n#endif\n#ifndef PY_LONG_LONG\n  #define PY_LONG_LONG LONG_LONG\n#endif\n#ifndef Py_HUGE_VAL\n  #define Py_HUGE_VAL HUGE_VAL\n#endif\n#ifdef PYPY_VERSION\n  #define CYTHON_COMPILING_IN_PYPY 1\n  #define CYTHON_COMPILING_IN_PYSTON 0\n  #define CYTHON_COMPILING_IN_CPYTHON 0\n  #undef CYTHON_USE_TYPE_SLOTS\n  #define CYTHON_USE_TYPE_SLOTS 0\n  #undef CYTHON_USE_PYTYPE_LOOKUP\n  #define CYTHON_USE_PYTYPE_LOOKUP 0\n  #if PY_VERSION_HEX < 0x03050000\n    #undef CYTHON_USE_ASYNC_SLOTS\n    #define CYTHON_USE_ASYNC_SLOTS 0\n  #elif 
!defined(CYTHON_USE_ASYNC_SLOTS)\n    #define CYTHON_USE_ASYNC_SLOTS 1\n  #endif\n  #undef CYTHON_USE_PYLIST_INTERNALS\n  #define CYTHON_USE_PYLIST_INTERNALS 0\n  #undef CYTHON_USE_UNICODE_INTERNALS\n  #define CYTHON_USE_UNICODE_INTERNALS 0\n  #undef CYTHON_USE_UNICODE_WRITER\n  #define CYTHON_USE_UNICODE_WRITER 0\n  #undef CYTHON_USE_PYLONG_INTERNALS\n  #define CYTHON_USE_PYLONG_INTERNALS 0\n  #undef CYTHON_AVOID_BORROWED_REFS\n  #define CYTHON_AVOID_BORROWED_REFS 1\n  #undef CYTHON_ASSUME_SAFE_MACROS\n  #define CYTHON_ASSUME_SAFE_MACROS 0\n  #undef CYTHON_UNPACK_METHODS\n  #define CYTHON_UNPACK_METHODS 0\n  #undef CYTHON_FAST_THREAD_STATE\n  #define CYTHON_FAST_THREAD_STATE 0\n  #undef CYTHON_FAST_PYCALL\n  #define CYTHON_FAST_PYCALL 0\n  #undef CYTHON_PEP489_MULTI_PHASE_INIT\n  #define CYTHON_PEP489_MULTI_PHASE_INIT 0\n  #undef CYTHON_USE_TP_FINALIZE\n  #define CYTHON_USE_TP_FINALIZE 0\n  #undef CYTHON_USE_DICT_VERSIONS\n  #define CYTHON_USE_DICT_VERSIONS 0\n  #undef CYTHON_USE_EXC_INFO_STACK\n  #define CYTHON_USE_EXC_INFO_STACK 0\n#elif defined(PYSTON_VERSION)\n  #define CYTHON_COMPILING_IN_PYPY 0\n  #define CYTHON_COMPILING_IN_PYSTON 1\n  #define CYTHON_COMPILING_IN_CPYTHON 0\n  #ifndef CYTHON_USE_TYPE_SLOTS\n    #define CYTHON_USE_TYPE_SLOTS 1\n  #endif\n  #undef CYTHON_USE_PYTYPE_LOOKUP\n  #define CYTHON_USE_PYTYPE_LOOKUP 0\n  #undef CYTHON_USE_ASYNC_SLOTS\n  #define CYTHON_USE_ASYNC_SLOTS 0\n  #undef CYTHON_USE_PYLIST_INTERNALS\n  #define CYTHON_USE_PYLIST_INTERNALS 0\n  #ifndef CYTHON_USE_UNICODE_INTERNALS\n    #define CYTHON_USE_UNICODE_INTERNALS 1\n  #endif\n  #undef CYTHON_USE_UNICODE_WRITER\n  #define CYTHON_USE_UNICODE_WRITER 0\n  #undef CYTHON_USE_PYLONG_INTERNALS\n  #define CYTHON_USE_PYLONG_INTERNALS 0\n  #ifndef CYTHON_AVOID_BORROWED_REFS\n    #define CYTHON_AVOID_BORROWED_REFS 0\n  #endif\n  #ifndef CYTHON_ASSUME_SAFE_MACROS\n    #define CYTHON_ASSUME_SAFE_MACROS 1\n  #endif\n  #ifndef CYTHON_UNPACK_METHODS\n    #define CYTHON_UNPACK_METHODS 1\n  
#endif\n  #undef CYTHON_FAST_THREAD_STATE\n  #define CYTHON_FAST_THREAD_STATE 0\n  #undef CYTHON_FAST_PYCALL\n  #define CYTHON_FAST_PYCALL 0\n  #undef CYTHON_PEP489_MULTI_PHASE_INIT\n  #define CYTHON_PEP489_MULTI_PHASE_INIT 0\n  #undef CYTHON_USE_TP_FINALIZE\n  #define CYTHON_USE_TP_FINALIZE 0\n  #undef CYTHON_USE_DICT_VERSIONS\n  #define CYTHON_USE_DICT_VERSIONS 0\n  #undef CYTHON_USE_EXC_INFO_STACK\n  #define CYTHON_USE_EXC_INFO_STACK 0\n#else\n  #define CYTHON_COMPILING_IN_PYPY 0\n  #define CYTHON_COMPILING_IN_PYSTON 0\n  #define CYTHON_COMPILING_IN_CPYTHON 1\n  #ifndef CYTHON_USE_TYPE_SLOTS\n    #define CYTHON_USE_TYPE_SLOTS 1\n  #endif\n  #if PY_VERSION_HEX < 0x02070000\n    #undef CYTHON_USE_PYTYPE_LOOKUP\n    #define CYTHON_USE_PYTYPE_LOOKUP 0\n  #elif !defined(CYTHON_USE_PYTYPE_LOOKUP)\n    #define CYTHON_USE_PYTYPE_LOOKUP 1\n  #endif\n  #if PY_MAJOR_VERSION < 3\n    #undef CYTHON_USE_ASYNC_SLOTS\n    #define CYTHON_USE_ASYNC_SLOTS 0\n  #elif !defined(CYTHON_USE_ASYNC_SLOTS)\n    #define CYTHON_USE_ASYNC_SLOTS 1\n  #endif\n  #if PY_VERSION_HEX < 0x02070000\n    #undef CYTHON_USE_PYLONG_INTERNALS\n    #define CYTHON_USE_PYLONG_INTERNALS 0\n  #elif !defined(CYTHON_USE_PYLONG_INTERNALS)\n    #define CYTHON_USE_PYLONG_INTERNALS 1\n  #endif\n  #ifndef CYTHON_USE_PYLIST_INTERNALS\n    #define CYTHON_USE_PYLIST_INTERNALS 1\n  #endif\n  #ifndef CYTHON_USE_UNICODE_INTERNALS\n    #define CYTHON_USE_UNICODE_INTERNALS 1\n  #endif\n  #if PY_VERSION_HEX < 0x030300F0 || PY_VERSION_HEX >= 0x030B00A2\n    #undef CYTHON_USE_UNICODE_WRITER\n    #define CYTHON_USE_UNICODE_WRITER 0\n  #elif !defined(CYTHON_USE_UNICODE_WRITER)\n    #define CYTHON_USE_UNICODE_WRITER 1\n  #endif\n  #ifndef CYTHON_AVOID_BORROWED_REFS\n    #define CYTHON_AVOID_BORROWED_REFS 0\n  #endif\n  #ifndef CYTHON_ASSUME_SAFE_MACROS\n    #define CYTHON_ASSUME_SAFE_MACROS 1\n  #endif\n  #ifndef CYTHON_UNPACK_METHODS\n    #define CYTHON_UNPACK_METHODS 1\n  #endif\n  #if PY_VERSION_HEX >= 0x030B00A4\n    #undef 
CYTHON_FAST_THREAD_STATE\n    #define CYTHON_FAST_THREAD_STATE 0\n  #elif !defined(CYTHON_FAST_THREAD_STATE)\n    #define CYTHON_FAST_THREAD_STATE 1\n  #endif\n  #ifndef CYTHON_FAST_PYCALL\n    #define CYTHON_FAST_PYCALL (PY_VERSION_HEX < 0x030B00A1)\n  #endif\n  #ifndef CYTHON_PEP489_MULTI_PHASE_INIT\n    #define CYTHON_PEP489_MULTI_PHASE_INIT (PY_VERSION_HEX >= 0x03050000)\n  #endif\n  #ifndef CYTHON_USE_TP_FINALIZE\n    #define CYTHON_USE_TP_FINALIZE (PY_VERSION_HEX >= 0x030400a1)\n  #endif\n  #ifndef CYTHON_USE_DICT_VERSIONS\n    #define CYTHON_USE_DICT_VERSIONS (PY_VERSION_HEX >= 0x030600B1)\n  #endif\n  #if PY_VERSION_HEX >= 0x030B00A4\n    #undef CYTHON_USE_EXC_INFO_STACK\n    #define CYTHON_USE_EXC_INFO_STACK 0\n  #elif !defined(CYTHON_USE_EXC_INFO_STACK)\n    #define CYTHON_USE_EXC_INFO_STACK (PY_VERSION_HEX >= 0x030700A3)\n  #endif\n#endif\n#if !defined(CYTHON_FAST_PYCCALL)\n#define CYTHON_FAST_PYCCALL  (CYTHON_FAST_PYCALL && PY_VERSION_HEX >= 0x030600B1)\n#endif\n#if CYTHON_USE_PYLONG_INTERNALS\n  #if PY_MAJOR_VERSION < 3\n    #include \"longintrepr.h\"\n  #endif\n  #undef SHIFT\n  #undef BASE\n  #undef MASK\n  #ifdef SIZEOF_VOID_P\n    enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };\n  #endif\n#endif\n#ifndef __has_attribute\n  #define __has_attribute(x) 0\n#endif\n#ifndef __has_cpp_attribute\n  #define __has_cpp_attribute(x) 0\n#endif\n#ifndef CYTHON_RESTRICT\n  #if defined(__GNUC__)\n    #define CYTHON_RESTRICT __restrict__\n  #elif defined(_MSC_VER) && _MSC_VER >= 1400\n    #define CYTHON_RESTRICT __restrict\n  #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L\n    #define CYTHON_RESTRICT restrict\n  #else\n    #define CYTHON_RESTRICT\n  #endif\n#endif\n#ifndef CYTHON_UNUSED\n# if defined(__GNUC__)\n#   if !(defined(__cplusplus)) || (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4))\n#     define CYTHON_UNUSED __attribute__ ((__unused__))\n#   else\n#     define CYTHON_UNUSED\n#   endif\n# elif 
defined(__ICC) || (defined(__INTEL_COMPILER) && !defined(_MSC_VER))\n#   define CYTHON_UNUSED __attribute__ ((__unused__))\n# else\n#   define CYTHON_UNUSED\n# endif\n#endif\n#ifndef CYTHON_MAYBE_UNUSED_VAR\n#  if defined(__cplusplus)\n     template<class T> void CYTHON_MAYBE_UNUSED_VAR( const T& ) { }\n#  else\n#    define CYTHON_MAYBE_UNUSED_VAR(x) (void)(x)\n#  endif\n#endif\n#ifndef CYTHON_NCP_UNUSED\n# if CYTHON_COMPILING_IN_CPYTHON\n#  define CYTHON_NCP_UNUSED\n# else\n#  define CYTHON_NCP_UNUSED CYTHON_UNUSED\n# endif\n#endif\n#define __Pyx_void_to_None(void_result) ((void)(void_result), Py_INCREF(Py_None), Py_None)\n#ifdef _MSC_VER\n    #ifndef _MSC_STDINT_H_\n        #if _MSC_VER < 1300\n           typedef unsigned char     uint8_t;\n           typedef unsigned int      uint32_t;\n        #else\n           typedef unsigned __int8   uint8_t;\n           typedef unsigned __int32  uint32_t;\n        #endif\n    #endif\n#else\n   #include <stdint.h>\n#endif\n#ifndef CYTHON_FALLTHROUGH\n  #if defined(__cplusplus) && __cplusplus >= 201103L\n    #if __has_cpp_attribute(fallthrough)\n      #define CYTHON_FALLTHROUGH [[fallthrough]]\n    #elif __has_cpp_attribute(clang::fallthrough)\n      #define CYTHON_FALLTHROUGH [[clang::fallthrough]]\n    #elif __has_cpp_attribute(gnu::fallthrough)\n      #define CYTHON_FALLTHROUGH [[gnu::fallthrough]]\n    #endif\n  #endif\n  #ifndef CYTHON_FALLTHROUGH\n    #if __has_attribute(fallthrough)\n      #define CYTHON_FALLTHROUGH __attribute__((fallthrough))\n    #else\n      #define CYTHON_FALLTHROUGH\n    #endif\n  #endif\n  #if defined(__clang__ ) && defined(__apple_build_version__)\n    #if __apple_build_version__ < 7000000\n      #undef  CYTHON_FALLTHROUGH\n      #define CYTHON_FALLTHROUGH\n    #endif\n  #endif\n#endif\n\n#ifndef CYTHON_INLINE\n  #if defined(__clang__)\n    #define CYTHON_INLINE __inline__ __attribute__ ((__unused__))\n  #elif defined(__GNUC__)\n    #define CYTHON_INLINE __inline__\n  #elif defined(_MSC_VER)\n  
  #define CYTHON_INLINE __inline\n  #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L\n    #define CYTHON_INLINE inline\n  #else\n    #define CYTHON_INLINE\n  #endif\n#endif\n\n#if CYTHON_COMPILING_IN_PYPY && PY_VERSION_HEX < 0x02070600 && !defined(Py_OptimizeFlag)\n  #define Py_OptimizeFlag 0\n#endif\n#define __PYX_BUILD_PY_SSIZE_T \"n\"\n#define CYTHON_FORMAT_SSIZE_T \"z\"\n#if PY_MAJOR_VERSION < 3\n  #define __Pyx_BUILTIN_MODULE_NAME \"__builtin__\"\n  #define __Pyx_PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)\\\n          PyCode_New(a+k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)\n  #define __Pyx_DefaultClassType PyClass_Type\n#else\n  #define __Pyx_BUILTIN_MODULE_NAME \"builtins\"\n  #define __Pyx_DefaultClassType PyType_Type\n#if PY_VERSION_HEX >= 0x030B00A1\n    static CYTHON_INLINE PyCodeObject* __Pyx_PyCode_New(int a, int k, int l, int s, int f,\n                                                    PyObject *code, PyObject *c, PyObject* n, PyObject *v,\n                                                    PyObject *fv, PyObject *cell, PyObject* fn,\n                                                    PyObject *name, int fline, PyObject *lnos) {\n        PyObject *kwds=NULL, *argcount=NULL, *posonlyargcount=NULL, *kwonlyargcount=NULL;\n        PyObject *nlocals=NULL, *stacksize=NULL, *flags=NULL, *replace=NULL, *call_result=NULL, *empty=NULL;\n        const char *fn_cstr=NULL;\n        const char *name_cstr=NULL;\n        PyCodeObject* co=NULL;\n        PyObject *type, *value, *traceback;\n        PyErr_Fetch(&type, &value, &traceback);\n        if (!(kwds=PyDict_New())) goto end;\n        if (!(argcount=PyLong_FromLong(a))) goto end;\n        if (PyDict_SetItemString(kwds, \"co_argcount\", argcount) != 0) goto end;\n        if (!(posonlyargcount=PyLong_FromLong(0))) goto end;\n        if (PyDict_SetItemString(kwds, \"co_posonlyargcount\", posonlyargcount) != 0) goto end;\n        if 
(!(kwonlyargcount=PyLong_FromLong(k))) goto end;\n        if (PyDict_SetItemString(kwds, \"co_kwonlyargcount\", kwonlyargcount) != 0) goto end;\n        if (!(nlocals=PyLong_FromLong(l))) goto end;\n        if (PyDict_SetItemString(kwds, \"co_nlocals\", nlocals) != 0) goto end;\n        if (!(stacksize=PyLong_FromLong(s))) goto end;\n        if (PyDict_SetItemString(kwds, \"co_stacksize\", stacksize) != 0) goto end;\n        if (!(flags=PyLong_FromLong(f))) goto end;\n        if (PyDict_SetItemString(kwds, \"co_flags\", flags) != 0) goto end;\n        if (PyDict_SetItemString(kwds, \"co_code\", code) != 0) goto end;\n        if (PyDict_SetItemString(kwds, \"co_consts\", c) != 0) goto end;\n        if (PyDict_SetItemString(kwds, \"co_names\", n) != 0) goto end;\n        if (PyDict_SetItemString(kwds, \"co_varnames\", v) != 0) goto end;\n        if (PyDict_SetItemString(kwds, \"co_freevars\", fv) != 0) goto end;\n        if (PyDict_SetItemString(kwds, \"co_cellvars\", cell) != 0) goto end;\n        if (PyDict_SetItemString(kwds, \"co_linetable\", lnos) != 0) goto end;\n        if (!(fn_cstr=PyUnicode_AsUTF8AndSize(fn, NULL))) goto end;\n        if (!(name_cstr=PyUnicode_AsUTF8AndSize(name, NULL))) goto end;\n        if (!(co = PyCode_NewEmpty(fn_cstr, name_cstr, fline))) goto end;\n        if (!(replace = PyObject_GetAttrString((PyObject*)co, \"replace\"))) goto cleanup_code_too;\n        if (!(empty = PyTuple_New(0))) goto cleanup_code_too; // unfortunately __pyx_empty_tuple isn't available here\n        if (!(call_result = PyObject_Call(replace, empty, kwds))) goto cleanup_code_too;\n        Py_XDECREF((PyObject*)co);\n        co = (PyCodeObject*)call_result;\n        call_result = NULL;\n        if (0) {\n            cleanup_code_too:\n            Py_XDECREF((PyObject*)co);\n            co = NULL;\n        }\n        end:\n        Py_XDECREF(kwds);\n        Py_XDECREF(argcount);\n        Py_XDECREF(posonlyargcount);\n        Py_XDECREF(kwonlyargcount);\n        
Py_XDECREF(nlocals);\n        Py_XDECREF(stacksize);\n        Py_XDECREF(replace);\n        Py_XDECREF(call_result);\n        Py_XDECREF(empty);\n        if (type) {\n            PyErr_Restore(type, value, traceback);\n        }\n        return co;\n    }\n#else\n  #define __Pyx_PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)\\\n          PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)\n#endif\n  #define __Pyx_DefaultClassType PyType_Type\n#endif\n#ifndef Py_TPFLAGS_CHECKTYPES\n  #define Py_TPFLAGS_CHECKTYPES 0\n#endif\n#ifndef Py_TPFLAGS_HAVE_INDEX\n  #define Py_TPFLAGS_HAVE_INDEX 0\n#endif\n#ifndef Py_TPFLAGS_HAVE_NEWBUFFER\n  #define Py_TPFLAGS_HAVE_NEWBUFFER 0\n#endif\n#ifndef Py_TPFLAGS_HAVE_FINALIZE\n  #define Py_TPFLAGS_HAVE_FINALIZE 0\n#endif\n#ifndef METH_STACKLESS\n  #define METH_STACKLESS 0\n#endif\n#if PY_VERSION_HEX <= 0x030700A3 || !defined(METH_FASTCALL)\n  #ifndef METH_FASTCALL\n     #define METH_FASTCALL 0x80\n  #endif\n  typedef PyObject *(*__Pyx_PyCFunctionFast) (PyObject *self, PyObject *const *args, Py_ssize_t nargs);\n  typedef PyObject *(*__Pyx_PyCFunctionFastWithKeywords) (PyObject *self, PyObject *const *args,\n                                                          Py_ssize_t nargs, PyObject *kwnames);\n#else\n  #define __Pyx_PyCFunctionFast _PyCFunctionFast\n  #define __Pyx_PyCFunctionFastWithKeywords _PyCFunctionFastWithKeywords\n#endif\n#if CYTHON_FAST_PYCCALL\n#define __Pyx_PyFastCFunction_Check(func)\\\n    ((PyCFunction_Check(func) && (METH_FASTCALL == (PyCFunction_GET_FLAGS(func) & ~(METH_CLASS | METH_STATIC | METH_COEXIST | METH_KEYWORDS | METH_STACKLESS)))))\n#else\n#define __Pyx_PyFastCFunction_Check(func) 0\n#endif\n#if CYTHON_COMPILING_IN_PYPY && !defined(PyObject_Malloc)\n  #define PyObject_Malloc(s)   PyMem_Malloc(s)\n  #define PyObject_Free(p)     PyMem_Free(p)\n  #define PyObject_Realloc(p)  PyMem_Realloc(p)\n#endif\n#if CYTHON_COMPILING_IN_CPYTHON && PY_VERSION_HEX < 
0x030400A1\n  #define PyMem_RawMalloc(n)           PyMem_Malloc(n)\n  #define PyMem_RawRealloc(p, n)       PyMem_Realloc(p, n)\n  #define PyMem_RawFree(p)             PyMem_Free(p)\n#endif\n#if CYTHON_COMPILING_IN_PYSTON\n  #define __Pyx_PyCode_HasFreeVars(co)  PyCode_HasFreeVars(co)\n  #define __Pyx_PyFrame_SetLineNumber(frame, lineno) PyFrame_SetLineNumber(frame, lineno)\n#else\n  #define __Pyx_PyCode_HasFreeVars(co)  (PyCode_GetNumFree(co) > 0)\n  #define __Pyx_PyFrame_SetLineNumber(frame, lineno)  (frame)->f_lineno = (lineno)\n#endif\n#if !CYTHON_FAST_THREAD_STATE || PY_VERSION_HEX < 0x02070000\n  #define __Pyx_PyThreadState_Current PyThreadState_GET()\n#elif PY_VERSION_HEX >= 0x03060000\n  #define __Pyx_PyThreadState_Current _PyThreadState_UncheckedGet()\n#elif PY_VERSION_HEX >= 0x03000000\n  #define __Pyx_PyThreadState_Current PyThreadState_GET()\n#else\n  #define __Pyx_PyThreadState_Current _PyThreadState_Current\n#endif\n#if PY_VERSION_HEX < 0x030700A2 && !defined(PyThread_tss_create) && !defined(Py_tss_NEEDS_INIT)\n#include \"pythread.h\"\n#define Py_tss_NEEDS_INIT 0\ntypedef int Py_tss_t;\nstatic CYTHON_INLINE int PyThread_tss_create(Py_tss_t *key) {\n  *key = PyThread_create_key();\n  return 0;\n}\nstatic CYTHON_INLINE Py_tss_t * PyThread_tss_alloc(void) {\n  Py_tss_t *key = (Py_tss_t *)PyObject_Malloc(sizeof(Py_tss_t));\n  *key = Py_tss_NEEDS_INIT;\n  return key;\n}\nstatic CYTHON_INLINE void PyThread_tss_free(Py_tss_t *key) {\n  PyObject_Free(key);\n}\nstatic CYTHON_INLINE int PyThread_tss_is_created(Py_tss_t *key) {\n  return *key != Py_tss_NEEDS_INIT;\n}\nstatic CYTHON_INLINE void PyThread_tss_delete(Py_tss_t *key) {\n  PyThread_delete_key(*key);\n  *key = Py_tss_NEEDS_INIT;\n}\nstatic CYTHON_INLINE int PyThread_tss_set(Py_tss_t *key, void *value) {\n  return PyThread_set_key_value(*key, value);\n}\nstatic CYTHON_INLINE void * PyThread_tss_get(Py_tss_t *key) {\n  return PyThread_get_key_value(*key);\n}\n#endif\n#if CYTHON_COMPILING_IN_CPYTHON || 
defined(_PyDict_NewPresized)\n#define __Pyx_PyDict_NewPresized(n)  ((n <= 8) ? PyDict_New() : _PyDict_NewPresized(n))\n#else\n#define __Pyx_PyDict_NewPresized(n)  PyDict_New()\n#endif\n#if PY_MAJOR_VERSION >= 3 || CYTHON_FUTURE_DIVISION\n  #define __Pyx_PyNumber_Divide(x,y)         PyNumber_TrueDivide(x,y)\n  #define __Pyx_PyNumber_InPlaceDivide(x,y)  PyNumber_InPlaceTrueDivide(x,y)\n#else\n  #define __Pyx_PyNumber_Divide(x,y)         PyNumber_Divide(x,y)\n  #define __Pyx_PyNumber_InPlaceDivide(x,y)  PyNumber_InPlaceDivide(x,y)\n#endif\n#if CYTHON_COMPILING_IN_CPYTHON && PY_VERSION_HEX >= 0x030500A1 && CYTHON_USE_UNICODE_INTERNALS\n#define __Pyx_PyDict_GetItemStr(dict, name)  _PyDict_GetItem_KnownHash(dict, name, ((PyASCIIObject *) name)->hash)\n#else\n#define __Pyx_PyDict_GetItemStr(dict, name)  PyDict_GetItem(dict, name)\n#endif\n#if PY_VERSION_HEX > 0x03030000 && defined(PyUnicode_KIND)\n  #define CYTHON_PEP393_ENABLED 1\n  #if defined(PyUnicode_IS_READY)\n  #define __Pyx_PyUnicode_READY(op)       (likely(PyUnicode_IS_READY(op)) ?\\\n                                              0 : _PyUnicode_Ready((PyObject *)(op)))\n  #else\n  #define __Pyx_PyUnicode_READY(op)       (0)\n  #endif\n  #define __Pyx_PyUnicode_GET_LENGTH(u)   PyUnicode_GET_LENGTH(u)\n  #define __Pyx_PyUnicode_READ_CHAR(u, i) PyUnicode_READ_CHAR(u, i)\n  #define __Pyx_PyUnicode_MAX_CHAR_VALUE(u)   PyUnicode_MAX_CHAR_VALUE(u)\n  #define __Pyx_PyUnicode_KIND(u)         PyUnicode_KIND(u)\n  #define __Pyx_PyUnicode_DATA(u)         PyUnicode_DATA(u)\n  #define __Pyx_PyUnicode_READ(k, d, i)   PyUnicode_READ(k, d, i)\n  #define __Pyx_PyUnicode_WRITE(k, d, i, ch)  PyUnicode_WRITE(k, d, i, ch)\n  #if defined(PyUnicode_IS_READY) && defined(PyUnicode_GET_SIZE)\n  #if CYTHON_COMPILING_IN_CPYTHON && PY_VERSION_HEX >= 0x03090000\n  #define __Pyx_PyUnicode_IS_TRUE(u)      (0 != (likely(PyUnicode_IS_READY(u)) ? 
PyUnicode_GET_LENGTH(u) : ((PyCompactUnicodeObject *)(u))->wstr_length))\n  #else\n  #define __Pyx_PyUnicode_IS_TRUE(u)      (0 != (likely(PyUnicode_IS_READY(u)) ? PyUnicode_GET_LENGTH(u) : PyUnicode_GET_SIZE(u)))\n  #endif\n  #else\n  #define __Pyx_PyUnicode_IS_TRUE(u)      (0 != PyUnicode_GET_LENGTH(u))\n  #endif\n#else\n  #define CYTHON_PEP393_ENABLED 0\n  #define PyUnicode_1BYTE_KIND  1\n  #define PyUnicode_2BYTE_KIND  2\n  #define PyUnicode_4BYTE_KIND  4\n  #define __Pyx_PyUnicode_READY(op)       (0)\n  #define __Pyx_PyUnicode_GET_LENGTH(u)   PyUnicode_GET_SIZE(u)\n  #define __Pyx_PyUnicode_READ_CHAR(u, i) ((Py_UCS4)(PyUnicode_AS_UNICODE(u)[i]))\n  #define __Pyx_PyUnicode_MAX_CHAR_VALUE(u)   ((sizeof(Py_UNICODE) == 2) ? 65535 : 1114111)\n  #define __Pyx_PyUnicode_KIND(u)         (sizeof(Py_UNICODE))\n  #define __Pyx_PyUnicode_DATA(u)         ((void*)PyUnicode_AS_UNICODE(u))\n  #define __Pyx_PyUnicode_READ(k, d, i)   ((void)(k), (Py_UCS4)(((Py_UNICODE*)d)[i]))\n  #define __Pyx_PyUnicode_WRITE(k, d, i, ch)  (((void)(k)), ((Py_UNICODE*)d)[i] = ch)\n  #define __Pyx_PyUnicode_IS_TRUE(u)      (0 != PyUnicode_GET_SIZE(u))\n#endif\n#if CYTHON_COMPILING_IN_PYPY\n  #define __Pyx_PyUnicode_Concat(a, b)      PyNumber_Add(a, b)\n  #define __Pyx_PyUnicode_ConcatSafe(a, b)  PyNumber_Add(a, b)\n#else\n  #define __Pyx_PyUnicode_Concat(a, b)      PyUnicode_Concat(a, b)\n  #define __Pyx_PyUnicode_ConcatSafe(a, b)  ((unlikely((a) == Py_None) || unlikely((b) == Py_None)) ?\\\n      PyNumber_Add(a, b) : __Pyx_PyUnicode_Concat(a, b))\n#endif\n#if CYTHON_COMPILING_IN_PYPY && !defined(PyUnicode_Contains)\n  #define PyUnicode_Contains(u, s)  PySequence_Contains(u, s)\n#endif\n#if CYTHON_COMPILING_IN_PYPY && !defined(PyByteArray_Check)\n  #define PyByteArray_Check(obj)  PyObject_TypeCheck(obj, &PyByteArray_Type)\n#endif\n#if CYTHON_COMPILING_IN_PYPY && !defined(PyObject_Format)\n  #define PyObject_Format(obj, fmt)  PyObject_CallMethod(obj, \"__format__\", \"O\", fmt)\n#endif\n#define 
__Pyx_PyString_FormatSafe(a, b)   ((unlikely((a) == Py_None || (PyString_Check(b) && !PyString_CheckExact(b)))) ? PyNumber_Remainder(a, b) : __Pyx_PyString_Format(a, b))\n#define __Pyx_PyUnicode_FormatSafe(a, b)  ((unlikely((a) == Py_None || (PyUnicode_Check(b) && !PyUnicode_CheckExact(b)))) ? PyNumber_Remainder(a, b) : PyUnicode_Format(a, b))\n#if PY_MAJOR_VERSION >= 3\n  #define __Pyx_PyString_Format(a, b)  PyUnicode_Format(a, b)\n#else\n  #define __Pyx_PyString_Format(a, b)  PyString_Format(a, b)\n#endif\n#if PY_MAJOR_VERSION < 3 && !defined(PyObject_ASCII)\n  #define PyObject_ASCII(o)            PyObject_Repr(o)\n#endif\n#if PY_MAJOR_VERSION >= 3\n  #define PyBaseString_Type            PyUnicode_Type\n  #define PyStringObject               PyUnicodeObject\n  #define PyString_Type                PyUnicode_Type\n  #define PyString_Check               PyUnicode_Check\n  #define PyString_CheckExact          PyUnicode_CheckExact\n#ifndef PyObject_Unicode\n  #define PyObject_Unicode             PyObject_Str\n#endif\n#endif\n#if PY_MAJOR_VERSION >= 3\n  #define __Pyx_PyBaseString_Check(obj) PyUnicode_Check(obj)\n  #define __Pyx_PyBaseString_CheckExact(obj) PyUnicode_CheckExact(obj)\n#else\n  #define __Pyx_PyBaseString_Check(obj) (PyString_Check(obj) || PyUnicode_Check(obj))\n  #define __Pyx_PyBaseString_CheckExact(obj) (PyString_CheckExact(obj) || PyUnicode_CheckExact(obj))\n#endif\n#ifndef PySet_CheckExact\n  #define PySet_CheckExact(obj)        (Py_TYPE(obj) == &PySet_Type)\n#endif\n#if PY_VERSION_HEX >= 0x030900A4\n  #define __Pyx_SET_REFCNT(obj, refcnt) Py_SET_REFCNT(obj, refcnt)\n  #define __Pyx_SET_SIZE(obj, size) Py_SET_SIZE(obj, size)\n#else\n  #define __Pyx_SET_REFCNT(obj, refcnt) Py_REFCNT(obj) = (refcnt)\n  #define __Pyx_SET_SIZE(obj, size) Py_SIZE(obj) = (size)\n#endif\n#if CYTHON_ASSUME_SAFE_MACROS\n  #define __Pyx_PySequence_SIZE(seq)  Py_SIZE(seq)\n#else\n  #define __Pyx_PySequence_SIZE(seq)  PySequence_Size(seq)\n#endif\n#if PY_MAJOR_VERSION >= 3\n  
#define PyIntObject                  PyLongObject\n  #define PyInt_Type                   PyLong_Type\n  #define PyInt_Check(op)              PyLong_Check(op)\n  #define PyInt_CheckExact(op)         PyLong_CheckExact(op)\n  #define PyInt_FromString             PyLong_FromString\n  #define PyInt_FromUnicode            PyLong_FromUnicode\n  #define PyInt_FromLong               PyLong_FromLong\n  #define PyInt_FromSize_t             PyLong_FromSize_t\n  #define PyInt_FromSsize_t            PyLong_FromSsize_t\n  #define PyInt_AsLong                 PyLong_AsLong\n  #define PyInt_AS_LONG                PyLong_AS_LONG\n  #define PyInt_AsSsize_t              PyLong_AsSsize_t\n  #define PyInt_AsUnsignedLongMask     PyLong_AsUnsignedLongMask\n  #define PyInt_AsUnsignedLongLongMask PyLong_AsUnsignedLongLongMask\n  #define PyNumber_Int                 PyNumber_Long\n#endif\n#if PY_MAJOR_VERSION >= 3\n  #define PyBoolObject                 PyLongObject\n#endif\n#if PY_MAJOR_VERSION >= 3 && CYTHON_COMPILING_IN_PYPY\n  #ifndef PyUnicode_InternFromString\n    #define PyUnicode_InternFromString(s) PyUnicode_FromString(s)\n  #endif\n#endif\n#if PY_VERSION_HEX < 0x030200A4\n  typedef long Py_hash_t;\n  #define __Pyx_PyInt_FromHash_t PyInt_FromLong\n  #define __Pyx_PyInt_AsHash_t   __Pyx_PyIndex_AsHash_t\n#else\n  #define __Pyx_PyInt_FromHash_t PyInt_FromSsize_t\n  #define __Pyx_PyInt_AsHash_t   __Pyx_PyIndex_AsSsize_t\n#endif\n#if PY_MAJOR_VERSION >= 3\n  #define __Pyx_PyMethod_New(func, self, klass) ((self) ? 
((void)(klass), PyMethod_New(func, self)) : __Pyx_NewRef(func))\n#else\n  #define __Pyx_PyMethod_New(func, self, klass) PyMethod_New(func, self, klass)\n#endif\n#if CYTHON_USE_ASYNC_SLOTS\n  #if PY_VERSION_HEX >= 0x030500B1\n    #define __Pyx_PyAsyncMethodsStruct PyAsyncMethods\n    #define __Pyx_PyType_AsAsync(obj) (Py_TYPE(obj)->tp_as_async)\n  #else\n    #define __Pyx_PyType_AsAsync(obj) ((__Pyx_PyAsyncMethodsStruct*) (Py_TYPE(obj)->tp_reserved))\n  #endif\n#else\n  #define __Pyx_PyType_AsAsync(obj) NULL\n#endif\n#ifndef __Pyx_PyAsyncMethodsStruct\n    typedef struct {\n        unaryfunc am_await;\n        unaryfunc am_aiter;\n        unaryfunc am_anext;\n    } __Pyx_PyAsyncMethodsStruct;\n#endif\n\n#if defined(WIN32) || defined(MS_WINDOWS)\n  #define _USE_MATH_DEFINES\n#endif\n#include <math.h>\n#ifdef NAN\n#define __PYX_NAN() ((float) NAN)\n#else\nstatic CYTHON_INLINE float __PYX_NAN() {\n  float value;\n  memset(&value, 0xFF, sizeof(value));\n  return value;\n}\n#endif\n#if defined(__CYGWIN__) && defined(_LDBL_EQ_DBL)\n#define __Pyx_truncl trunc\n#else\n#define __Pyx_truncl truncl\n#endif\n\n#define __PYX_MARK_ERR_POS(f_index, lineno) \\\n    { __pyx_filename = __pyx_f[f_index]; (void)__pyx_filename; __pyx_lineno = lineno; (void)__pyx_lineno; __pyx_clineno = __LINE__; (void)__pyx_clineno; }\n#define __PYX_ERR(f_index, lineno, Ln_error) \\\n    { __PYX_MARK_ERR_POS(f_index, lineno) goto Ln_error; }\n\n#ifndef __PYX_EXTERN_C\n  #ifdef __cplusplus\n    #define __PYX_EXTERN_C extern \"C\"\n  #else\n    #define __PYX_EXTERN_C extern\n  #endif\n#endif\n\n#define __PYX_HAVE__TTS__tts__utils__monotonic_align__core\n#define __PYX_HAVE_API__TTS__tts__utils__monotonic_align__core\n/* Early includes */\n#include <string.h>\n#include <stdio.h>\n#include \"numpy/arrayobject.h\"\n#include \"numpy/ndarrayobject.h\"\n#include \"numpy/ndarraytypes.h\"\n#include \"numpy/arrayscalars.h\"\n#include \"numpy/ufuncobject.h\"\n\n    /* NumPy API declarations from 
\"numpy/__init__.pxd\" */\n    \n#include \"pythread.h\"\n#include <stdlib.h>\n#include \"pystate.h\"\n#ifdef _OPENMP\n#include <omp.h>\n#endif /* _OPENMP */\n\n#if defined(PYREX_WITHOUT_ASSERTIONS) && !defined(CYTHON_WITHOUT_ASSERTIONS)\n#define CYTHON_WITHOUT_ASSERTIONS\n#endif\n\ntypedef struct {PyObject **p; const char *s; const Py_ssize_t n; const char* encoding;\n                const char is_unicode; const char is_str; const char intern; } __Pyx_StringTabEntry;\n\n#define __PYX_DEFAULT_STRING_ENCODING_IS_ASCII 0\n#define __PYX_DEFAULT_STRING_ENCODING_IS_UTF8 0\n#define __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT (PY_MAJOR_VERSION >= 3 && __PYX_DEFAULT_STRING_ENCODING_IS_UTF8)\n#define __PYX_DEFAULT_STRING_ENCODING \"\"\n#define __Pyx_PyObject_FromString __Pyx_PyBytes_FromString\n#define __Pyx_PyObject_FromStringAndSize __Pyx_PyBytes_FromStringAndSize\n#define __Pyx_uchar_cast(c) ((unsigned char)c)\n#define __Pyx_long_cast(x) ((long)x)\n#define __Pyx_fits_Py_ssize_t(v, type, is_signed)  (\\\n    (sizeof(type) < sizeof(Py_ssize_t))  ||\\\n    (sizeof(type) > sizeof(Py_ssize_t) &&\\\n          likely(v < (type)PY_SSIZE_T_MAX ||\\\n                 v == (type)PY_SSIZE_T_MAX)  &&\\\n          (!is_signed || likely(v > (type)PY_SSIZE_T_MIN ||\\\n                                v == (type)PY_SSIZE_T_MIN)))  ||\\\n    (sizeof(type) == sizeof(Py_ssize_t) &&\\\n          (is_signed || likely(v < (type)PY_SSIZE_T_MAX ||\\\n                               v == (type)PY_SSIZE_T_MAX)))  )\nstatic CYTHON_INLINE int __Pyx_is_valid_index(Py_ssize_t i, Py_ssize_t limit) {\n    return (size_t) i < (size_t) limit;\n}\n#if defined (__cplusplus) && __cplusplus >= 201103L\n    #include <cstdlib>\n    #define __Pyx_sst_abs(value) std::abs(value)\n#elif SIZEOF_INT >= SIZEOF_SIZE_T\n    #define __Pyx_sst_abs(value) abs(value)\n#elif SIZEOF_LONG >= SIZEOF_SIZE_T\n    #define __Pyx_sst_abs(value) labs(value)\n#elif defined (_MSC_VER)\n    #define __Pyx_sst_abs(value) 
((Py_ssize_t)_abs64(value))\n#elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L\n    #define __Pyx_sst_abs(value) llabs(value)\n#elif defined (__GNUC__)\n    #define __Pyx_sst_abs(value) __builtin_llabs(value)\n#else\n    #define __Pyx_sst_abs(value) ((value<0) ? -value : value)\n#endif\nstatic CYTHON_INLINE const char* __Pyx_PyObject_AsString(PyObject*);\nstatic CYTHON_INLINE const char* __Pyx_PyObject_AsStringAndSize(PyObject*, Py_ssize_t* length);\n#define __Pyx_PyByteArray_FromString(s) PyByteArray_FromStringAndSize((const char*)s, strlen((const char*)s))\n#define __Pyx_PyByteArray_FromStringAndSize(s, l) PyByteArray_FromStringAndSize((const char*)s, l)\n#define __Pyx_PyBytes_FromString        PyBytes_FromString\n#define __Pyx_PyBytes_FromStringAndSize PyBytes_FromStringAndSize\nstatic CYTHON_INLINE PyObject* __Pyx_PyUnicode_FromString(const char*);\n#if PY_MAJOR_VERSION < 3\n    #define __Pyx_PyStr_FromString        __Pyx_PyBytes_FromString\n    #define __Pyx_PyStr_FromStringAndSize __Pyx_PyBytes_FromStringAndSize\n#else\n    #define __Pyx_PyStr_FromString        __Pyx_PyUnicode_FromString\n    #define __Pyx_PyStr_FromStringAndSize __Pyx_PyUnicode_FromStringAndSize\n#endif\n#define __Pyx_PyBytes_AsWritableString(s)     ((char*) PyBytes_AS_STRING(s))\n#define __Pyx_PyBytes_AsWritableSString(s)    ((signed char*) PyBytes_AS_STRING(s))\n#define __Pyx_PyBytes_AsWritableUString(s)    ((unsigned char*) PyBytes_AS_STRING(s))\n#define __Pyx_PyBytes_AsString(s)     ((const char*) PyBytes_AS_STRING(s))\n#define __Pyx_PyBytes_AsSString(s)    ((const signed char*) PyBytes_AS_STRING(s))\n#define __Pyx_PyBytes_AsUString(s)    ((const unsigned char*) PyBytes_AS_STRING(s))\n#define __Pyx_PyObject_AsWritableString(s)    ((char*) __Pyx_PyObject_AsString(s))\n#define __Pyx_PyObject_AsWritableSString(s)    ((signed char*) __Pyx_PyObject_AsString(s))\n#define __Pyx_PyObject_AsWritableUString(s)    ((unsigned char*) __Pyx_PyObject_AsString(s))\n#define 
__Pyx_PyObject_AsSString(s)    ((const signed char*) __Pyx_PyObject_AsString(s))\n#define __Pyx_PyObject_AsUString(s)    ((const unsigned char*) __Pyx_PyObject_AsString(s))\n#define __Pyx_PyObject_FromCString(s)  __Pyx_PyObject_FromString((const char*)s)\n#define __Pyx_PyBytes_FromCString(s)   __Pyx_PyBytes_FromString((const char*)s)\n#define __Pyx_PyByteArray_FromCString(s)   __Pyx_PyByteArray_FromString((const char*)s)\n#define __Pyx_PyStr_FromCString(s)     __Pyx_PyStr_FromString((const char*)s)\n#define __Pyx_PyUnicode_FromCString(s) __Pyx_PyUnicode_FromString((const char*)s)\nstatic CYTHON_INLINE size_t __Pyx_Py_UNICODE_strlen(const Py_UNICODE *u) {\n    const Py_UNICODE *u_end = u;\n    while (*u_end++) ;\n    return (size_t)(u_end - u - 1);\n}\n#define __Pyx_PyUnicode_FromUnicode(u)       PyUnicode_FromUnicode(u, __Pyx_Py_UNICODE_strlen(u))\n#define __Pyx_PyUnicode_FromUnicodeAndLength PyUnicode_FromUnicode\n#define __Pyx_PyUnicode_AsUnicode            PyUnicode_AsUnicode\n#define __Pyx_NewRef(obj) (Py_INCREF(obj), obj)\n#define __Pyx_Owned_Py_None(b) __Pyx_NewRef(Py_None)\nstatic CYTHON_INLINE PyObject * __Pyx_PyBool_FromLong(long b);\nstatic CYTHON_INLINE int __Pyx_PyObject_IsTrue(PyObject*);\nstatic CYTHON_INLINE int __Pyx_PyObject_IsTrueAndDecref(PyObject*);\nstatic CYTHON_INLINE PyObject* __Pyx_PyNumber_IntOrLong(PyObject* x);\n#define __Pyx_PySequence_Tuple(obj)\\\n    (likely(PyTuple_CheckExact(obj)) ? __Pyx_NewRef(obj) : PySequence_Tuple(obj))\nstatic CYTHON_INLINE Py_ssize_t __Pyx_PyIndex_AsSsize_t(PyObject*);\nstatic CYTHON_INLINE PyObject * __Pyx_PyInt_FromSize_t(size_t);\nstatic CYTHON_INLINE Py_hash_t __Pyx_PyIndex_AsHash_t(PyObject*);\n#if CYTHON_ASSUME_SAFE_MACROS\n#define __pyx_PyFloat_AsDouble(x) (PyFloat_CheckExact(x) ? 
PyFloat_AS_DOUBLE(x) : PyFloat_AsDouble(x))\n#else\n#define __pyx_PyFloat_AsDouble(x) PyFloat_AsDouble(x)\n#endif\n#define __pyx_PyFloat_AsFloat(x) ((float) __pyx_PyFloat_AsDouble(x))\n#if PY_MAJOR_VERSION >= 3\n#define __Pyx_PyNumber_Int(x) (PyLong_CheckExact(x) ? __Pyx_NewRef(x) : PyNumber_Long(x))\n#else\n#define __Pyx_PyNumber_Int(x) (PyInt_CheckExact(x) ? __Pyx_NewRef(x) : PyNumber_Int(x))\n#endif\n#define __Pyx_PyNumber_Float(x) (PyFloat_CheckExact(x) ? __Pyx_NewRef(x) : PyNumber_Float(x))\n#if PY_MAJOR_VERSION < 3 && __PYX_DEFAULT_STRING_ENCODING_IS_ASCII\nstatic int __Pyx_sys_getdefaultencoding_not_ascii;\nstatic int __Pyx_init_sys_getdefaultencoding_params(void) {\n    PyObject* sys;\n    PyObject* default_encoding = NULL;\n    PyObject* ascii_chars_u = NULL;\n    PyObject* ascii_chars_b = NULL;\n    const char* default_encoding_c;\n    sys = PyImport_ImportModule(\"sys\");\n    if (!sys) goto bad;\n    default_encoding = PyObject_CallMethod(sys, (char*) \"getdefaultencoding\", NULL);\n    Py_DECREF(sys);\n    if (!default_encoding) goto bad;\n    default_encoding_c = PyBytes_AsString(default_encoding);\n    if (!default_encoding_c) goto bad;\n    if (strcmp(default_encoding_c, \"ascii\") == 0) {\n        __Pyx_sys_getdefaultencoding_not_ascii = 0;\n    } else {\n        char ascii_chars[128];\n        int c;\n        for (c = 0; c < 128; c++) {\n            ascii_chars[c] = c;\n        }\n        __Pyx_sys_getdefaultencoding_not_ascii = 1;\n        ascii_chars_u = PyUnicode_DecodeASCII(ascii_chars, 128, NULL);\n        if (!ascii_chars_u) goto bad;\n        ascii_chars_b = PyUnicode_AsEncodedString(ascii_chars_u, default_encoding_c, NULL);\n        if (!ascii_chars_b || !PyBytes_Check(ascii_chars_b) || memcmp(ascii_chars, PyBytes_AS_STRING(ascii_chars_b), 128) != 0) {\n            PyErr_Format(\n                PyExc_ValueError,\n                \"This module compiled with c_string_encoding=ascii, but default encoding '%.200s' is not a superset of 
ascii.\",\n                default_encoding_c);\n            goto bad;\n        }\n        Py_DECREF(ascii_chars_u);\n        Py_DECREF(ascii_chars_b);\n    }\n    Py_DECREF(default_encoding);\n    return 0;\nbad:\n    Py_XDECREF(default_encoding);\n    Py_XDECREF(ascii_chars_u);\n    Py_XDECREF(ascii_chars_b);\n    return -1;\n}\n#endif\n#if __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT && PY_MAJOR_VERSION >= 3\n#define __Pyx_PyUnicode_FromStringAndSize(c_str, size) PyUnicode_DecodeUTF8(c_str, size, NULL)\n#else\n#define __Pyx_PyUnicode_FromStringAndSize(c_str, size) PyUnicode_Decode(c_str, size, __PYX_DEFAULT_STRING_ENCODING, NULL)\n#if __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT\nstatic char* __PYX_DEFAULT_STRING_ENCODING;\nstatic int __Pyx_init_sys_getdefaultencoding_params(void) {\n    PyObject* sys;\n    PyObject* default_encoding = NULL;\n    char* default_encoding_c;\n    sys = PyImport_ImportModule(\"sys\");\n    if (!sys) goto bad;\n    default_encoding = PyObject_CallMethod(sys, (char*) (const char*) \"getdefaultencoding\", NULL);\n    Py_DECREF(sys);\n    if (!default_encoding) goto bad;\n    default_encoding_c = PyBytes_AsString(default_encoding);\n    if (!default_encoding_c) goto bad;\n    __PYX_DEFAULT_STRING_ENCODING = (char*) malloc(strlen(default_encoding_c) + 1);\n    if (!__PYX_DEFAULT_STRING_ENCODING) goto bad;\n    strcpy(__PYX_DEFAULT_STRING_ENCODING, default_encoding_c);\n    Py_DECREF(default_encoding);\n    return 0;\nbad:\n    Py_XDECREF(default_encoding);\n    return -1;\n}\n#endif\n#endif\n\n\n/* Test for GCC > 2.95 */\n#if defined(__GNUC__)     && (__GNUC__ > 2 || (__GNUC__ == 2 && (__GNUC_MINOR__ > 95)))\n  #define likely(x)   __builtin_expect(!!(x), 1)\n  #define unlikely(x) __builtin_expect(!!(x), 0)\n#else /* !__GNUC__ or GCC < 2.95 */\n  #define likely(x)   (x)\n  #define unlikely(x) (x)\n#endif /* __GNUC__ */\nstatic CYTHON_INLINE void __Pyx_pretend_to_initialize(void* ptr) { (void)ptr; }\n\nstatic PyObject *__pyx_m = NULL;\nstatic 
PyObject *__pyx_d;\nstatic PyObject *__pyx_b;\nstatic PyObject *__pyx_cython_runtime = NULL;\nstatic PyObject *__pyx_empty_tuple;\nstatic PyObject *__pyx_empty_bytes;\nstatic PyObject *__pyx_empty_unicode;\nstatic int __pyx_lineno;\nstatic int __pyx_clineno = 0;\nstatic const char * __pyx_cfilenm= __FILE__;\nstatic const char *__pyx_filename;\n\n/* Header.proto */\n#if !defined(CYTHON_CCOMPLEX)\n  #if defined(__cplusplus)\n    #define CYTHON_CCOMPLEX 1\n  #elif defined(_Complex_I)\n    #define CYTHON_CCOMPLEX 1\n  #else\n    #define CYTHON_CCOMPLEX 0\n  #endif\n#endif\n#if CYTHON_CCOMPLEX\n  #ifdef __cplusplus\n    #include <complex>\n  #else\n    #include <complex.h>\n  #endif\n#endif\n#if CYTHON_CCOMPLEX && !defined(__cplusplus) && defined(__sun__) && defined(__GNUC__)\n  #undef _Complex_I\n  #define _Complex_I 1.0fj\n#endif\n\n\nstatic const char *__pyx_f[] = {\n  \"TTS\\\\tts\\\\utils\\\\monotonic_align\\\\core.pyx\",\n  \"__init__.pxd\",\n  \"stringsource\",\n  \"type.pxd\",\n};\n/* NoFastGil.proto */\n#define __Pyx_PyGILState_Ensure PyGILState_Ensure\n#define __Pyx_PyGILState_Release PyGILState_Release\n#define __Pyx_FastGIL_Remember()\n#define __Pyx_FastGIL_Forget()\n#define __Pyx_FastGilFuncInit()\n\n/* MemviewSliceStruct.proto */\nstruct __pyx_memoryview_obj;\ntypedef struct {\n  struct __pyx_memoryview_obj *memview;\n  char *data;\n  Py_ssize_t shape[8];\n  Py_ssize_t strides[8];\n  Py_ssize_t suboffsets[8];\n} __Pyx_memviewslice;\n#define __Pyx_MemoryView_Len(m)  (m.shape[0])\n\n/* Atomics.proto */\n#include <pythread.h>\n#ifndef CYTHON_ATOMICS\n    #define CYTHON_ATOMICS 1\n#endif\n#define __pyx_atomic_int_type int\n#if CYTHON_ATOMICS && __GNUC__ >= 4 && (__GNUC_MINOR__ > 1 ||\\\n                    (__GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL >= 2)) &&\\\n                    !defined(__i386__)\n    #define __pyx_atomic_incr_aligned(value, lock) __sync_fetch_and_add(value, 1)\n    #define __pyx_atomic_decr_aligned(value, lock) __sync_fetch_and_sub(value, 
1)\n    #ifdef __PYX_DEBUG_ATOMICS\n        #warning \"Using GNU atomics\"\n    #endif\n#elif CYTHON_ATOMICS && defined(_MSC_VER) && 0\n    #include <Windows.h>\n    #undef __pyx_atomic_int_type\n    #define __pyx_atomic_int_type LONG\n    #define __pyx_atomic_incr_aligned(value, lock) InterlockedIncrement(value)\n    #define __pyx_atomic_decr_aligned(value, lock) InterlockedDecrement(value)\n    #ifdef __PYX_DEBUG_ATOMICS\n        #pragma message (\"Using MSVC atomics\")\n    #endif\n#elif CYTHON_ATOMICS && (defined(__ICC) || defined(__INTEL_COMPILER)) && 0\n    #define __pyx_atomic_incr_aligned(value, lock) _InterlockedIncrement(value)\n    #define __pyx_atomic_decr_aligned(value, lock) _InterlockedDecrement(value)\n    #ifdef __PYX_DEBUG_ATOMICS\n        #warning \"Using Intel atomics\"\n    #endif\n#else\n    #undef CYTHON_ATOMICS\n    #define CYTHON_ATOMICS 0\n    #ifdef __PYX_DEBUG_ATOMICS\n        #warning \"Not using atomics\"\n    #endif\n#endif\ntypedef volatile __pyx_atomic_int_type __pyx_atomic_int;\n#if CYTHON_ATOMICS\n    #define __pyx_add_acquisition_count(memview)\\\n             __pyx_atomic_incr_aligned(__pyx_get_slice_count_pointer(memview), memview->lock)\n    #define __pyx_sub_acquisition_count(memview)\\\n            __pyx_atomic_decr_aligned(__pyx_get_slice_count_pointer(memview), memview->lock)\n#else\n    #define __pyx_add_acquisition_count(memview)\\\n            __pyx_add_acquisition_count_locked(__pyx_get_slice_count_pointer(memview), memview->lock)\n    #define __pyx_sub_acquisition_count(memview)\\\n            __pyx_sub_acquisition_count_locked(__pyx_get_slice_count_pointer(memview), memview->lock)\n#endif\n\n/* ForceInitThreads.proto */\n#ifndef __PYX_FORCE_INIT_THREADS\n  #define __PYX_FORCE_INIT_THREADS 0\n#endif\n\n/* BufferFormatStructs.proto */\n#define IS_UNSIGNED(type) (((type) -1) > 0)\nstruct __Pyx_StructField_;\n#define __PYX_BUF_FLAGS_PACKED_STRUCT (1 << 0)\ntypedef struct {\n  const char* name;\n  struct 
__Pyx_StructField_* fields;\n  size_t size;\n  size_t arraysize[8];\n  int ndim;\n  char typegroup;\n  char is_unsigned;\n  int flags;\n} __Pyx_TypeInfo;\ntypedef struct __Pyx_StructField_ {\n  __Pyx_TypeInfo* type;\n  const char* name;\n  size_t offset;\n} __Pyx_StructField;\ntypedef struct {\n  __Pyx_StructField* field;\n  size_t parent_offset;\n} __Pyx_BufFmt_StackElem;\ntypedef struct {\n  __Pyx_StructField root;\n  __Pyx_BufFmt_StackElem* head;\n  size_t fmt_offset;\n  size_t new_count, enc_count;\n  size_t struct_alignment;\n  int is_complex;\n  char enc_type;\n  char new_packmode;\n  char enc_packmode;\n  char is_valid_array;\n} __Pyx_BufFmt_Context;\n\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":690\n * # in Cython to enable them only on the right systems.\n * \n * ctypedef npy_int8       int8_t             # <<<<<<<<<<<<<<\n * ctypedef npy_int16      int16_t\n * ctypedef npy_int32      int32_t\n */\ntypedef npy_int8 __pyx_t_5numpy_int8_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":691\n * \n * ctypedef npy_int8       int8_t\n * ctypedef npy_int16      int16_t             # <<<<<<<<<<<<<<\n * ctypedef npy_int32      int32_t\n * ctypedef npy_int64      int64_t\n */\ntypedef npy_int16 __pyx_t_5numpy_int16_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":692\n * ctypedef npy_int8       int8_t\n * ctypedef npy_int16      int16_t\n * ctypedef npy_int32      int32_t             # <<<<<<<<<<<<<<\n * ctypedef npy_int64      int64_t\n * #ctypedef npy_int96      int96_t\n */\ntypedef npy_int32 __pyx_t_5numpy_int32_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":693\n * ctypedef npy_int16      int16_t\n * ctypedef npy_int32      int32_t\n * ctypedef npy_int64      int64_t             # 
<<<<<<<<<<<<<<\n * #ctypedef npy_int96      int96_t\n * #ctypedef npy_int128     int128_t\n */\ntypedef npy_int64 __pyx_t_5numpy_int64_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":697\n * #ctypedef npy_int128     int128_t\n * \n * ctypedef npy_uint8      uint8_t             # <<<<<<<<<<<<<<\n * ctypedef npy_uint16     uint16_t\n * ctypedef npy_uint32     uint32_t\n */\ntypedef npy_uint8 __pyx_t_5numpy_uint8_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":698\n * \n * ctypedef npy_uint8      uint8_t\n * ctypedef npy_uint16     uint16_t             # <<<<<<<<<<<<<<\n * ctypedef npy_uint32     uint32_t\n * ctypedef npy_uint64     uint64_t\n */\ntypedef npy_uint16 __pyx_t_5numpy_uint16_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":699\n * ctypedef npy_uint8      uint8_t\n * ctypedef npy_uint16     uint16_t\n * ctypedef npy_uint32     uint32_t             # <<<<<<<<<<<<<<\n * ctypedef npy_uint64     uint64_t\n * #ctypedef npy_uint96     uint96_t\n */\ntypedef npy_uint32 __pyx_t_5numpy_uint32_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":700\n * ctypedef npy_uint16     uint16_t\n * ctypedef npy_uint32     uint32_t\n * ctypedef npy_uint64     uint64_t             # <<<<<<<<<<<<<<\n * #ctypedef npy_uint96     uint96_t\n * #ctypedef npy_uint128    uint128_t\n */\ntypedef npy_uint64 __pyx_t_5numpy_uint64_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":704\n * #ctypedef npy_uint128    uint128_t\n * \n * ctypedef npy_float32    float32_t             # <<<<<<<<<<<<<<\n * ctypedef npy_float64    float64_t\n * #ctypedef npy_float80    float80_t\n */\ntypedef npy_float32 __pyx_t_5numpy_float32_t;\n\n/* 
\"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":705\n * \n * ctypedef npy_float32    float32_t\n * ctypedef npy_float64    float64_t             # <<<<<<<<<<<<<<\n * #ctypedef npy_float80    float80_t\n * #ctypedef npy_float128   float128_t\n */\ntypedef npy_float64 __pyx_t_5numpy_float64_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":714\n * # The int types are mapped a bit surprising --\n * # numpy.int corresponds to 'l' and numpy.long to 'q'\n * ctypedef npy_long       int_t             # <<<<<<<<<<<<<<\n * ctypedef npy_longlong   long_t\n * ctypedef npy_longlong   longlong_t\n */\ntypedef npy_long __pyx_t_5numpy_int_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":715\n * # numpy.int corresponds to 'l' and numpy.long to 'q'\n * ctypedef npy_long       int_t\n * ctypedef npy_longlong   long_t             # <<<<<<<<<<<<<<\n * ctypedef npy_longlong   longlong_t\n * \n */\ntypedef npy_longlong __pyx_t_5numpy_long_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":716\n * ctypedef npy_long       int_t\n * ctypedef npy_longlong   long_t\n * ctypedef npy_longlong   longlong_t             # <<<<<<<<<<<<<<\n * \n * ctypedef npy_ulong      uint_t\n */\ntypedef npy_longlong __pyx_t_5numpy_longlong_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":718\n * ctypedef npy_longlong   longlong_t\n * \n * ctypedef npy_ulong      uint_t             # <<<<<<<<<<<<<<\n * ctypedef npy_ulonglong  ulong_t\n * ctypedef npy_ulonglong  ulonglong_t\n */\ntypedef npy_ulong __pyx_t_5numpy_uint_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":719\n * \n * ctypedef npy_ulong      uint_t\n * 
ctypedef npy_ulonglong  ulong_t             # <<<<<<<<<<<<<<\n * ctypedef npy_ulonglong  ulonglong_t\n * \n */\ntypedef npy_ulonglong __pyx_t_5numpy_ulong_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":720\n * ctypedef npy_ulong      uint_t\n * ctypedef npy_ulonglong  ulong_t\n * ctypedef npy_ulonglong  ulonglong_t             # <<<<<<<<<<<<<<\n * \n * ctypedef npy_intp       intp_t\n */\ntypedef npy_ulonglong __pyx_t_5numpy_ulonglong_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":722\n * ctypedef npy_ulonglong  ulonglong_t\n * \n * ctypedef npy_intp       intp_t             # <<<<<<<<<<<<<<\n * ctypedef npy_uintp      uintp_t\n * \n */\ntypedef npy_intp __pyx_t_5numpy_intp_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":723\n * \n * ctypedef npy_intp       intp_t\n * ctypedef npy_uintp      uintp_t             # <<<<<<<<<<<<<<\n * \n * ctypedef npy_double     float_t\n */\ntypedef npy_uintp __pyx_t_5numpy_uintp_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":725\n * ctypedef npy_uintp      uintp_t\n * \n * ctypedef npy_double     float_t             # <<<<<<<<<<<<<<\n * ctypedef npy_double     double_t\n * ctypedef npy_longdouble longdouble_t\n */\ntypedef npy_double __pyx_t_5numpy_float_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":726\n * \n * ctypedef npy_double     float_t\n * ctypedef npy_double     double_t             # <<<<<<<<<<<<<<\n * ctypedef npy_longdouble longdouble_t\n * \n */\ntypedef npy_double __pyx_t_5numpy_double_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":727\n * ctypedef npy_double     float_t\n * ctypedef npy_double     
double_t\n * ctypedef npy_longdouble longdouble_t             # <<<<<<<<<<<<<<\n * \n * ctypedef npy_cfloat      cfloat_t\n */\ntypedef npy_longdouble __pyx_t_5numpy_longdouble_t;\n/* Declarations.proto */\n#if CYTHON_CCOMPLEX\n  #ifdef __cplusplus\n    typedef ::std::complex< float > __pyx_t_float_complex;\n  #else\n    typedef float _Complex __pyx_t_float_complex;\n  #endif\n#else\n    typedef struct { float real, imag; } __pyx_t_float_complex;\n#endif\nstatic CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_parts(float, float);\n\n/* Declarations.proto */\n#if CYTHON_CCOMPLEX\n  #ifdef __cplusplus\n    typedef ::std::complex< double > __pyx_t_double_complex;\n  #else\n    typedef double _Complex __pyx_t_double_complex;\n  #endif\n#else\n    typedef struct { double real, imag; } __pyx_t_double_complex;\n#endif\nstatic CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_parts(double, double);\n\n\n/*--- Type declarations ---*/\nstruct __pyx_array_obj;\nstruct __pyx_MemviewEnum_obj;\nstruct __pyx_memoryview_obj;\nstruct __pyx_memoryviewslice_obj;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":729\n * ctypedef npy_longdouble longdouble_t\n * \n * ctypedef npy_cfloat      cfloat_t             # <<<<<<<<<<<<<<\n * ctypedef npy_cdouble     cdouble_t\n * ctypedef npy_clongdouble clongdouble_t\n */\ntypedef npy_cfloat __pyx_t_5numpy_cfloat_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":730\n * \n * ctypedef npy_cfloat      cfloat_t\n * ctypedef npy_cdouble     cdouble_t             # <<<<<<<<<<<<<<\n * ctypedef npy_clongdouble clongdouble_t\n * \n */\ntypedef npy_cdouble __pyx_t_5numpy_cdouble_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":731\n * ctypedef npy_cfloat      cfloat_t\n * ctypedef npy_cdouble     cdouble_t\n * ctypedef 
npy_clongdouble clongdouble_t             # <<<<<<<<<<<<<<\n * \n * ctypedef npy_cdouble     complex_t\n */\ntypedef npy_clongdouble __pyx_t_5numpy_clongdouble_t;\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":733\n * ctypedef npy_clongdouble clongdouble_t\n * \n * ctypedef npy_cdouble     complex_t             # <<<<<<<<<<<<<<\n * \n * cdef inline object PyArray_MultiIterNew1(a):\n */\ntypedef npy_cdouble __pyx_t_5numpy_complex_t;\nstruct __pyx_opt_args_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c;\n\n/* \"TTS/tts/utils/monotonic_align/core.pyx\":42\n * @cython.boundscheck(False)\n * @cython.wraparound(False)\n * cpdef void maximum_path_c(int[:,:,::1] paths, float[:,:,::1] values, int[::1] t_xs, int[::1] t_ys, float max_neg_val=-1e9) nogil:             # <<<<<<<<<<<<<<\n *   cdef int b = values.shape[0]\n * \n */\nstruct __pyx_opt_args_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c {\n  int __pyx_n;\n  float max_neg_val;\n};\n\n/* \"View.MemoryView\":105\n * \n * @cname(\"__pyx_array\")\n * cdef class array:             # <<<<<<<<<<<<<<\n * \n *     cdef:\n */\nstruct __pyx_array_obj {\n  PyObject_HEAD\n  struct __pyx_vtabstruct_array *__pyx_vtab;\n  char *data;\n  Py_ssize_t len;\n  char *format;\n  int ndim;\n  Py_ssize_t *_shape;\n  Py_ssize_t *_strides;\n  Py_ssize_t itemsize;\n  PyObject *mode;\n  PyObject *_format;\n  void (*callback_free_data)(void *);\n  int free_data;\n  int dtype_is_object;\n};\n\n\n/* \"View.MemoryView\":279\n * \n * @cname('__pyx_MemviewEnum')\n * cdef class Enum(object):             # <<<<<<<<<<<<<<\n *     cdef object name\n *     def __init__(self, name):\n */\nstruct __pyx_MemviewEnum_obj {\n  PyObject_HEAD\n  PyObject *name;\n};\n\n\n/* \"View.MemoryView\":330\n * \n * @cname('__pyx_memoryview')\n * cdef class memoryview(object):             # <<<<<<<<<<<<<<\n * \n *     cdef object obj\n */\nstruct __pyx_memoryview_obj {\n  PyObject_HEAD\n  
struct __pyx_vtabstruct_memoryview *__pyx_vtab;\n  PyObject *obj;\n  PyObject *_size;\n  PyObject *_array_interface;\n  PyThread_type_lock lock;\n  __pyx_atomic_int acquisition_count[2];\n  __pyx_atomic_int *acquisition_count_aligned_p;\n  Py_buffer view;\n  int flags;\n  int dtype_is_object;\n  __Pyx_TypeInfo *typeinfo;\n};\n\n\n/* \"View.MemoryView\":965\n * \n * @cname('__pyx_memoryviewslice')\n * cdef class _memoryviewslice(memoryview):             # <<<<<<<<<<<<<<\n *     \"Internal class for passing memoryview slices to Python\"\n * \n */\nstruct __pyx_memoryviewslice_obj {\n  struct __pyx_memoryview_obj __pyx_base;\n  __Pyx_memviewslice from_slice;\n  PyObject *from_object;\n  PyObject *(*to_object_func)(char *);\n  int (*to_dtype_func)(char *, PyObject *);\n};\n\n\n\n/* \"View.MemoryView\":105\n * \n * @cname(\"__pyx_array\")\n * cdef class array:             # <<<<<<<<<<<<<<\n * \n *     cdef:\n */\n\nstruct __pyx_vtabstruct_array {\n  PyObject *(*get_memview)(struct __pyx_array_obj *);\n};\nstatic struct __pyx_vtabstruct_array *__pyx_vtabptr_array;\n\n\n/* \"View.MemoryView\":330\n * \n * @cname('__pyx_memoryview')\n * cdef class memoryview(object):             # <<<<<<<<<<<<<<\n * \n *     cdef object obj\n */\n\nstruct __pyx_vtabstruct_memoryview {\n  char *(*get_item_pointer)(struct __pyx_memoryview_obj *, PyObject *);\n  PyObject *(*is_slice)(struct __pyx_memoryview_obj *, PyObject *);\n  PyObject *(*setitem_slice_assignment)(struct __pyx_memoryview_obj *, PyObject *, PyObject *);\n  PyObject *(*setitem_slice_assign_scalar)(struct __pyx_memoryview_obj *, struct __pyx_memoryview_obj *, PyObject *);\n  PyObject *(*setitem_indexed)(struct __pyx_memoryview_obj *, PyObject *, PyObject *);\n  PyObject *(*convert_item_to_object)(struct __pyx_memoryview_obj *, char *);\n  PyObject *(*assign_item_from_object)(struct __pyx_memoryview_obj *, char *, PyObject *);\n};\nstatic struct __pyx_vtabstruct_memoryview *__pyx_vtabptr_memoryview;\n\n\n/* 
\"View.MemoryView\":965\n * \n * @cname('__pyx_memoryviewslice')\n * cdef class _memoryviewslice(memoryview):             # <<<<<<<<<<<<<<\n *     \"Internal class for passing memoryview slices to Python\"\n * \n */\n\nstruct __pyx_vtabstruct__memoryviewslice {\n  struct __pyx_vtabstruct_memoryview __pyx_base;\n};\nstatic struct __pyx_vtabstruct__memoryviewslice *__pyx_vtabptr__memoryviewslice;\n\n/* --- Runtime support code (head) --- */\n/* Refnanny.proto */\n#ifndef CYTHON_REFNANNY\n  #define CYTHON_REFNANNY 0\n#endif\n#if CYTHON_REFNANNY\n  typedef struct {\n    void (*INCREF)(void*, PyObject*, int);\n    void (*DECREF)(void*, PyObject*, int);\n    void (*GOTREF)(void*, PyObject*, int);\n    void (*GIVEREF)(void*, PyObject*, int);\n    void* (*SetupContext)(const char*, int, const char*);\n    void (*FinishContext)(void**);\n  } __Pyx_RefNannyAPIStruct;\n  static __Pyx_RefNannyAPIStruct *__Pyx_RefNanny = NULL;\n  static __Pyx_RefNannyAPIStruct *__Pyx_RefNannyImportAPI(const char *modname);\n  #define __Pyx_RefNannyDeclarations void *__pyx_refnanny = NULL;\n#ifdef WITH_THREAD\n  #define __Pyx_RefNannySetupContext(name, acquire_gil)\\\n          if (acquire_gil) {\\\n              PyGILState_STATE __pyx_gilstate_save = PyGILState_Ensure();\\\n              __pyx_refnanny = __Pyx_RefNanny->SetupContext((name), __LINE__, __FILE__);\\\n              PyGILState_Release(__pyx_gilstate_save);\\\n          } else {\\\n              __pyx_refnanny = __Pyx_RefNanny->SetupContext((name), __LINE__, __FILE__);\\\n          }\n#else\n  #define __Pyx_RefNannySetupContext(name, acquire_gil)\\\n          __pyx_refnanny = __Pyx_RefNanny->SetupContext((name), __LINE__, __FILE__)\n#endif\n  #define __Pyx_RefNannyFinishContext()\\\n          __Pyx_RefNanny->FinishContext(&__pyx_refnanny)\n  #define __Pyx_INCREF(r)  __Pyx_RefNanny->INCREF(__pyx_refnanny, (PyObject *)(r), __LINE__)\n  #define __Pyx_DECREF(r)  __Pyx_RefNanny->DECREF(__pyx_refnanny, (PyObject *)(r), __LINE__)\n  #define 
__Pyx_GOTREF(r)  __Pyx_RefNanny->GOTREF(__pyx_refnanny, (PyObject *)(r), __LINE__)\n  #define __Pyx_GIVEREF(r) __Pyx_RefNanny->GIVEREF(__pyx_refnanny, (PyObject *)(r), __LINE__)\n  #define __Pyx_XINCREF(r)  do { if((r) != NULL) {__Pyx_INCREF(r); }} while(0)\n  #define __Pyx_XDECREF(r)  do { if((r) != NULL) {__Pyx_DECREF(r); }} while(0)\n  #define __Pyx_XGOTREF(r)  do { if((r) != NULL) {__Pyx_GOTREF(r); }} while(0)\n  #define __Pyx_XGIVEREF(r) do { if((r) != NULL) {__Pyx_GIVEREF(r);}} while(0)\n#else\n  #define __Pyx_RefNannyDeclarations\n  #define __Pyx_RefNannySetupContext(name, acquire_gil)\n  #define __Pyx_RefNannyFinishContext()\n  #define __Pyx_INCREF(r) Py_INCREF(r)\n  #define __Pyx_DECREF(r) Py_DECREF(r)\n  #define __Pyx_GOTREF(r)\n  #define __Pyx_GIVEREF(r)\n  #define __Pyx_XINCREF(r) Py_XINCREF(r)\n  #define __Pyx_XDECREF(r) Py_XDECREF(r)\n  #define __Pyx_XGOTREF(r)\n  #define __Pyx_XGIVEREF(r)\n#endif\n#define __Pyx_XDECREF_SET(r, v) do {\\\n        PyObject *tmp = (PyObject *) r;\\\n        r = v; __Pyx_XDECREF(tmp);\\\n    } while (0)\n#define __Pyx_DECREF_SET(r, v) do {\\\n        PyObject *tmp = (PyObject *) r;\\\n        r = v; __Pyx_DECREF(tmp);\\\n    } while (0)\n#define __Pyx_CLEAR(r)    do { PyObject* tmp = ((PyObject*)(r)); r = NULL; __Pyx_DECREF(tmp);} while(0)\n#define __Pyx_XCLEAR(r)   do { if((r) != NULL) {PyObject* tmp = ((PyObject*)(r)); r = NULL; __Pyx_DECREF(tmp);}} while(0)\n\n/* PyObjectGetAttrStr.proto */\n#if CYTHON_USE_TYPE_SLOTS\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_GetAttrStr(PyObject* obj, PyObject* attr_name);\n#else\n#define __Pyx_PyObject_GetAttrStr(o,n) PyObject_GetAttr(o,n)\n#endif\n\n/* GetBuiltinName.proto */\nstatic PyObject *__Pyx_GetBuiltinName(PyObject *name);\n\n/* MemviewSliceInit.proto */\n#define __Pyx_BUF_MAX_NDIMS %(BUF_MAX_NDIMS)d\n#define __Pyx_MEMVIEW_DIRECT   1\n#define __Pyx_MEMVIEW_PTR      2\n#define __Pyx_MEMVIEW_FULL     4\n#define __Pyx_MEMVIEW_CONTIG   8\n#define __Pyx_MEMVIEW_STRIDED  
16\n#define __Pyx_MEMVIEW_FOLLOW   32\n#define __Pyx_IS_C_CONTIG 1\n#define __Pyx_IS_F_CONTIG 2\nstatic int __Pyx_init_memviewslice(\n                struct __pyx_memoryview_obj *memview,\n                int ndim,\n                __Pyx_memviewslice *memviewslice,\n                int memview_is_new_reference);\nstatic CYTHON_INLINE int __pyx_add_acquisition_count_locked(\n    __pyx_atomic_int *acquisition_count, PyThread_type_lock lock);\nstatic CYTHON_INLINE int __pyx_sub_acquisition_count_locked(\n    __pyx_atomic_int *acquisition_count, PyThread_type_lock lock);\n#define __pyx_get_slice_count_pointer(memview) (memview->acquisition_count_aligned_p)\n#define __pyx_get_slice_count(memview) (*__pyx_get_slice_count_pointer(memview))\n#define __PYX_INC_MEMVIEW(slice, have_gil) __Pyx_INC_MEMVIEW(slice, have_gil, __LINE__)\n#define __PYX_XDEC_MEMVIEW(slice, have_gil) __Pyx_XDEC_MEMVIEW(slice, have_gil, __LINE__)\nstatic CYTHON_INLINE void __Pyx_INC_MEMVIEW(__Pyx_memviewslice *, int, int);\nstatic CYTHON_INLINE void __Pyx_XDEC_MEMVIEW(__Pyx_memviewslice *, int, int);\n\n/* RaiseArgTupleInvalid.proto */\nstatic void __Pyx_RaiseArgtupleInvalid(const char* func_name, int exact,\n    Py_ssize_t num_min, Py_ssize_t num_max, Py_ssize_t num_found);\n\n/* RaiseDoubleKeywords.proto */\nstatic void __Pyx_RaiseDoubleKeywordsError(const char* func_name, PyObject* kw_name);\n\n/* ParseKeywords.proto */\nstatic int __Pyx_ParseOptionalKeywords(PyObject *kwds, PyObject **argnames[],\\\n    PyObject *kwds2, PyObject *values[], Py_ssize_t num_pos_args,\\\n    const char* function_name);\n\n/* None.proto */\nstatic CYTHON_INLINE void __Pyx_RaiseUnboundLocalError(const char *varname);\n\n/* GetTopmostException.proto */\n#if CYTHON_USE_EXC_INFO_STACK\nstatic _PyErr_StackItem * __Pyx_PyErr_GetTopmostException(PyThreadState *tstate);\n#endif\n\n/* PyThreadStateGet.proto */\n#if CYTHON_FAST_THREAD_STATE\n#define __Pyx_PyThreadState_declare  PyThreadState *__pyx_tstate;\n#define 
__Pyx_PyThreadState_assign  __pyx_tstate = __Pyx_PyThreadState_Current;\n#define __Pyx_PyErr_Occurred()  __pyx_tstate->curexc_type\n#else\n#define __Pyx_PyThreadState_declare\n#define __Pyx_PyThreadState_assign\n#define __Pyx_PyErr_Occurred()  PyErr_Occurred()\n#endif\n\n/* SaveResetException.proto */\n#if CYTHON_FAST_THREAD_STATE\n#define __Pyx_ExceptionSave(type, value, tb)  __Pyx__ExceptionSave(__pyx_tstate, type, value, tb)\nstatic CYTHON_INLINE void __Pyx__ExceptionSave(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb);\n#define __Pyx_ExceptionReset(type, value, tb)  __Pyx__ExceptionReset(__pyx_tstate, type, value, tb)\nstatic CYTHON_INLINE void __Pyx__ExceptionReset(PyThreadState *tstate, PyObject *type, PyObject *value, PyObject *tb);\n#else\n#define __Pyx_ExceptionSave(type, value, tb)   PyErr_GetExcInfo(type, value, tb)\n#define __Pyx_ExceptionReset(type, value, tb)  PyErr_SetExcInfo(type, value, tb)\n#endif\n\n/* PyErrExceptionMatches.proto */\n#if CYTHON_FAST_THREAD_STATE\n#define __Pyx_PyErr_ExceptionMatches(err) __Pyx_PyErr_ExceptionMatchesInState(__pyx_tstate, err)\nstatic CYTHON_INLINE int __Pyx_PyErr_ExceptionMatchesInState(PyThreadState* tstate, PyObject* err);\n#else\n#define __Pyx_PyErr_ExceptionMatches(err)  PyErr_ExceptionMatches(err)\n#endif\n\n/* GetException.proto */\n#if CYTHON_FAST_THREAD_STATE\n#define __Pyx_GetException(type, value, tb)  __Pyx__GetException(__pyx_tstate, type, value, tb)\nstatic int __Pyx__GetException(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb);\n#else\nstatic int __Pyx_GetException(PyObject **type, PyObject **value, PyObject **tb);\n#endif\n\n/* PyObjectCall.proto */\n#if CYTHON_COMPILING_IN_CPYTHON\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_Call(PyObject *func, PyObject *arg, PyObject *kw);\n#else\n#define __Pyx_PyObject_Call(func, arg, kw) PyObject_Call(func, arg, kw)\n#endif\n\n/* PyErrFetchRestore.proto */\n#if CYTHON_FAST_THREAD_STATE\n#define 
__Pyx_PyErr_Clear() __Pyx_ErrRestore(NULL, NULL, NULL)\n#define __Pyx_ErrRestoreWithState(type, value, tb)  __Pyx_ErrRestoreInState(PyThreadState_GET(), type, value, tb)\n#define __Pyx_ErrFetchWithState(type, value, tb)    __Pyx_ErrFetchInState(PyThreadState_GET(), type, value, tb)\n#define __Pyx_ErrRestore(type, value, tb)  __Pyx_ErrRestoreInState(__pyx_tstate, type, value, tb)\n#define __Pyx_ErrFetch(type, value, tb)    __Pyx_ErrFetchInState(__pyx_tstate, type, value, tb)\nstatic CYTHON_INLINE void __Pyx_ErrRestoreInState(PyThreadState *tstate, PyObject *type, PyObject *value, PyObject *tb);\nstatic CYTHON_INLINE void __Pyx_ErrFetchInState(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb);\n#if CYTHON_COMPILING_IN_CPYTHON\n#define __Pyx_PyErr_SetNone(exc) (Py_INCREF(exc), __Pyx_ErrRestore((exc), NULL, NULL))\n#else\n#define __Pyx_PyErr_SetNone(exc) PyErr_SetNone(exc)\n#endif\n#else\n#define __Pyx_PyErr_Clear() PyErr_Clear()\n#define __Pyx_PyErr_SetNone(exc) PyErr_SetNone(exc)\n#define __Pyx_ErrRestoreWithState(type, value, tb)  PyErr_Restore(type, value, tb)\n#define __Pyx_ErrFetchWithState(type, value, tb)  PyErr_Fetch(type, value, tb)\n#define __Pyx_ErrRestoreInState(tstate, type, value, tb)  PyErr_Restore(type, value, tb)\n#define __Pyx_ErrFetchInState(tstate, type, value, tb)  PyErr_Fetch(type, value, tb)\n#define __Pyx_ErrRestore(type, value, tb)  PyErr_Restore(type, value, tb)\n#define __Pyx_ErrFetch(type, value, tb)  PyErr_Fetch(type, value, tb)\n#endif\n\n/* RaiseException.proto */\nstatic void __Pyx_Raise(PyObject *type, PyObject *value, PyObject *tb, PyObject *cause);\n\n/* ArgTypeTest.proto */\n#define __Pyx_ArgTypeTest(obj, type, none_allowed, name, exact)\\\n    ((likely((Py_TYPE(obj) == type) | (none_allowed && (obj == Py_None)))) ? 
1 :\\\n        __Pyx__ArgTypeTest(obj, type, name, exact))\nstatic int __Pyx__ArgTypeTest(PyObject *obj, PyTypeObject *type, const char *name, int exact);\n\n/* PyCFunctionFastCall.proto */\n#if CYTHON_FAST_PYCCALL\nstatic CYTHON_INLINE PyObject *__Pyx_PyCFunction_FastCall(PyObject *func, PyObject **args, Py_ssize_t nargs);\n#else\n#define __Pyx_PyCFunction_FastCall(func, args, nargs)  (assert(0), NULL)\n#endif\n\n/* PyFunctionFastCall.proto */\n#if CYTHON_FAST_PYCALL\n#define __Pyx_PyFunction_FastCall(func, args, nargs)\\\n    __Pyx_PyFunction_FastCallDict((func), (args), (nargs), NULL)\n#if 1 || PY_VERSION_HEX < 0x030600B1\nstatic PyObject *__Pyx_PyFunction_FastCallDict(PyObject *func, PyObject **args, Py_ssize_t nargs, PyObject *kwargs);\n#else\n#define __Pyx_PyFunction_FastCallDict(func, args, nargs, kwargs) _PyFunction_FastCallDict(func, args, nargs, kwargs)\n#endif\n#define __Pyx_BUILD_ASSERT_EXPR(cond)\\\n    (sizeof(char [1 - 2*!(cond)]) - 1)\n#ifndef Py_MEMBER_SIZE\n#define Py_MEMBER_SIZE(type, member) sizeof(((type *)0)->member)\n#endif\n#if CYTHON_FAST_PYCALL\n  static size_t __pyx_pyframe_localsplus_offset = 0;\n  #include \"frameobject.h\"\n  #define __Pxy_PyFrame_Initialize_Offsets()\\\n    ((void)__Pyx_BUILD_ASSERT_EXPR(sizeof(PyFrameObject) == offsetof(PyFrameObject, f_localsplus) + Py_MEMBER_SIZE(PyFrameObject, f_localsplus)),\\\n     (void)(__pyx_pyframe_localsplus_offset = ((size_t)PyFrame_Type.tp_basicsize) - Py_MEMBER_SIZE(PyFrameObject, f_localsplus)))\n  #define __Pyx_PyFrame_GetLocalsplus(frame)\\\n    (assert(__pyx_pyframe_localsplus_offset), (PyObject **)(((char *)(frame)) + __pyx_pyframe_localsplus_offset))\n#endif // CYTHON_FAST_PYCALL\n#endif\n\n/* PyObjectCall2Args.proto */\nstatic CYTHON_UNUSED PyObject* __Pyx_PyObject_Call2Args(PyObject* function, PyObject* arg1, PyObject* arg2);\n\n/* PyObjectCallMethO.proto */\n#if CYTHON_COMPILING_IN_CPYTHON\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_CallMethO(PyObject *func, PyObject 
*arg);\n#endif\n\n/* PyObjectCallOneArg.proto */\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_CallOneArg(PyObject *func, PyObject *arg);\n\n/* IncludeStringH.proto */\n#include <string.h>\n\n/* BytesEquals.proto */\nstatic CYTHON_INLINE int __Pyx_PyBytes_Equals(PyObject* s1, PyObject* s2, int equals);\n\n/* UnicodeEquals.proto */\nstatic CYTHON_INLINE int __Pyx_PyUnicode_Equals(PyObject* s1, PyObject* s2, int equals);\n\n/* StrEquals.proto */\n#if PY_MAJOR_VERSION >= 3\n#define __Pyx_PyString_Equals __Pyx_PyUnicode_Equals\n#else\n#define __Pyx_PyString_Equals __Pyx_PyBytes_Equals\n#endif\n\n/* DivInt[Py_ssize_t].proto */\nstatic CYTHON_INLINE Py_ssize_t __Pyx_div_Py_ssize_t(Py_ssize_t, Py_ssize_t);\n\n/* UnaryNegOverflows.proto */\n#define UNARY_NEG_WOULD_OVERFLOW(x)\\\n        (((x) < 0) & ((unsigned long)(x) == 0-(unsigned long)(x)))\n\nstatic CYTHON_UNUSED int __pyx_array_getbuffer(PyObject *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags); /*proto*/\nstatic PyObject *__pyx_array_get_memview(struct __pyx_array_obj *); /*proto*/\n/* GetAttr.proto */\nstatic CYTHON_INLINE PyObject *__Pyx_GetAttr(PyObject *, PyObject *);\n\n/* GetItemInt.proto */\n#define __Pyx_GetItemInt(o, i, type, is_signed, to_py_func, is_list, wraparound, boundscheck)\\\n    (__Pyx_fits_Py_ssize_t(i, type, is_signed) ?\\\n    __Pyx_GetItemInt_Fast(o, (Py_ssize_t)i, is_list, wraparound, boundscheck) :\\\n    (is_list ? 
(PyErr_SetString(PyExc_IndexError, \"list index out of range\"), (PyObject*)NULL) :\\\n               __Pyx_GetItemInt_Generic(o, to_py_func(i))))\n#define __Pyx_GetItemInt_List(o, i, type, is_signed, to_py_func, is_list, wraparound, boundscheck)\\\n    (__Pyx_fits_Py_ssize_t(i, type, is_signed) ?\\\n    __Pyx_GetItemInt_List_Fast(o, (Py_ssize_t)i, wraparound, boundscheck) :\\\n    (PyErr_SetString(PyExc_IndexError, \"list index out of range\"), (PyObject*)NULL))\nstatic CYTHON_INLINE PyObject *__Pyx_GetItemInt_List_Fast(PyObject *o, Py_ssize_t i,\n                                                              int wraparound, int boundscheck);\n#define __Pyx_GetItemInt_Tuple(o, i, type, is_signed, to_py_func, is_list, wraparound, boundscheck)\\\n    (__Pyx_fits_Py_ssize_t(i, type, is_signed) ?\\\n    __Pyx_GetItemInt_Tuple_Fast(o, (Py_ssize_t)i, wraparound, boundscheck) :\\\n    (PyErr_SetString(PyExc_IndexError, \"tuple index out of range\"), (PyObject*)NULL))\nstatic CYTHON_INLINE PyObject *__Pyx_GetItemInt_Tuple_Fast(PyObject *o, Py_ssize_t i,\n                                                              int wraparound, int boundscheck);\nstatic PyObject *__Pyx_GetItemInt_Generic(PyObject *o, PyObject* j);\nstatic CYTHON_INLINE PyObject *__Pyx_GetItemInt_Fast(PyObject *o, Py_ssize_t i,\n                                                     int is_list, int wraparound, int boundscheck);\n\n/* ObjectGetItem.proto */\n#if CYTHON_USE_TYPE_SLOTS\nstatic CYTHON_INLINE PyObject *__Pyx_PyObject_GetItem(PyObject *obj, PyObject* key);\n#else\n#define __Pyx_PyObject_GetItem(obj, key)  PyObject_GetItem(obj, key)\n#endif\n\n/* decode_c_string_utf16.proto */\nstatic CYTHON_INLINE PyObject *__Pyx_PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors) {\n    int byteorder = 0;\n    return PyUnicode_DecodeUTF16(s, size, errors, &byteorder);\n}\nstatic CYTHON_INLINE PyObject *__Pyx_PyUnicode_DecodeUTF16LE(const char *s, Py_ssize_t size, const char *errors) {\n   
 int byteorder = -1;\n    return PyUnicode_DecodeUTF16(s, size, errors, &byteorder);\n}\nstatic CYTHON_INLINE PyObject *__Pyx_PyUnicode_DecodeUTF16BE(const char *s, Py_ssize_t size, const char *errors) {\n    int byteorder = 1;\n    return PyUnicode_DecodeUTF16(s, size, errors, &byteorder);\n}\n\n/* decode_c_string.proto */\nstatic CYTHON_INLINE PyObject* __Pyx_decode_c_string(\n         const char* cstring, Py_ssize_t start, Py_ssize_t stop,\n         const char* encoding, const char* errors,\n         PyObject* (*decode_func)(const char *s, Py_ssize_t size, const char *errors));\n\n/* GetAttr3.proto */\nstatic CYTHON_INLINE PyObject *__Pyx_GetAttr3(PyObject *, PyObject *, PyObject *);\n\n/* PyDictVersioning.proto */\n#if CYTHON_USE_DICT_VERSIONS && CYTHON_USE_TYPE_SLOTS\n#define __PYX_DICT_VERSION_INIT  ((PY_UINT64_T) -1)\n#define __PYX_GET_DICT_VERSION(dict)  (((PyDictObject*)(dict))->ma_version_tag)\n#define __PYX_UPDATE_DICT_CACHE(dict, value, cache_var, version_var)\\\n    (version_var) = __PYX_GET_DICT_VERSION(dict);\\\n    (cache_var) = (value);\n#define __PYX_PY_DICT_LOOKUP_IF_MODIFIED(VAR, DICT, LOOKUP) {\\\n    static PY_UINT64_T __pyx_dict_version = 0;\\\n    static PyObject *__pyx_dict_cached_value = NULL;\\\n    if (likely(__PYX_GET_DICT_VERSION(DICT) == __pyx_dict_version)) {\\\n        (VAR) = __pyx_dict_cached_value;\\\n    } else {\\\n        (VAR) = __pyx_dict_cached_value = (LOOKUP);\\\n        __pyx_dict_version = __PYX_GET_DICT_VERSION(DICT);\\\n    }\\\n}\nstatic CYTHON_INLINE PY_UINT64_T __Pyx_get_tp_dict_version(PyObject *obj);\nstatic CYTHON_INLINE PY_UINT64_T __Pyx_get_object_dict_version(PyObject *obj);\nstatic CYTHON_INLINE int __Pyx_object_dict_version_matches(PyObject* obj, PY_UINT64_T tp_dict_version, PY_UINT64_T obj_dict_version);\n#else\n#define __PYX_GET_DICT_VERSION(dict)  (0)\n#define __PYX_UPDATE_DICT_CACHE(dict, value, cache_var, version_var)\n#define __PYX_PY_DICT_LOOKUP_IF_MODIFIED(VAR, DICT, LOOKUP)  (VAR) = 
(LOOKUP);\n#endif\n\n/* GetModuleGlobalName.proto */\n#if CYTHON_USE_DICT_VERSIONS\n#define __Pyx_GetModuleGlobalName(var, name)  {\\\n    static PY_UINT64_T __pyx_dict_version = 0;\\\n    static PyObject *__pyx_dict_cached_value = NULL;\\\n    (var) = (likely(__pyx_dict_version == __PYX_GET_DICT_VERSION(__pyx_d))) ?\\\n        (likely(__pyx_dict_cached_value) ? __Pyx_NewRef(__pyx_dict_cached_value) : __Pyx_GetBuiltinName(name)) :\\\n        __Pyx__GetModuleGlobalName(name, &__pyx_dict_version, &__pyx_dict_cached_value);\\\n}\n#define __Pyx_GetModuleGlobalNameUncached(var, name)  {\\\n    PY_UINT64_T __pyx_dict_version;\\\n    PyObject *__pyx_dict_cached_value;\\\n    (var) = __Pyx__GetModuleGlobalName(name, &__pyx_dict_version, &__pyx_dict_cached_value);\\\n}\nstatic PyObject *__Pyx__GetModuleGlobalName(PyObject *name, PY_UINT64_T *dict_version, PyObject **dict_cached_value);\n#else\n#define __Pyx_GetModuleGlobalName(var, name)  (var) = __Pyx__GetModuleGlobalName(name)\n#define __Pyx_GetModuleGlobalNameUncached(var, name)  (var) = __Pyx__GetModuleGlobalName(name)\nstatic CYTHON_INLINE PyObject *__Pyx__GetModuleGlobalName(PyObject *name);\n#endif\n\n/* RaiseTooManyValuesToUnpack.proto */\nstatic CYTHON_INLINE void __Pyx_RaiseTooManyValuesError(Py_ssize_t expected);\n\n/* RaiseNeedMoreValuesToUnpack.proto */\nstatic CYTHON_INLINE void __Pyx_RaiseNeedMoreValuesError(Py_ssize_t index);\n\n/* RaiseNoneIterError.proto */\nstatic CYTHON_INLINE void __Pyx_RaiseNoneNotIterableError(void);\n\n/* ExtTypeTest.proto */\nstatic CYTHON_INLINE int __Pyx_TypeTest(PyObject *obj, PyTypeObject *type);\n\n/* SwapException.proto */\n#if CYTHON_FAST_THREAD_STATE\n#define __Pyx_ExceptionSwap(type, value, tb)  __Pyx__ExceptionSwap(__pyx_tstate, type, value, tb)\nstatic CYTHON_INLINE void __Pyx__ExceptionSwap(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb);\n#else\nstatic CYTHON_INLINE void __Pyx_ExceptionSwap(PyObject **type, PyObject **value, PyObject 
**tb);\n#endif\n\n/* Import.proto */\nstatic PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int level);\n\n/* FastTypeChecks.proto */\n#if CYTHON_COMPILING_IN_CPYTHON\n#define __Pyx_TypeCheck(obj, type) __Pyx_IsSubtype(Py_TYPE(obj), (PyTypeObject *)type)\nstatic CYTHON_INLINE int __Pyx_IsSubtype(PyTypeObject *a, PyTypeObject *b);\nstatic CYTHON_INLINE int __Pyx_PyErr_GivenExceptionMatches(PyObject *err, PyObject *type);\nstatic CYTHON_INLINE int __Pyx_PyErr_GivenExceptionMatches2(PyObject *err, PyObject *type1, PyObject *type2);\n#else\n#define __Pyx_TypeCheck(obj, type) PyObject_TypeCheck(obj, (PyTypeObject *)type)\n#define __Pyx_PyErr_GivenExceptionMatches(err, type) PyErr_GivenExceptionMatches(err, type)\n#define __Pyx_PyErr_GivenExceptionMatches2(err, type1, type2) (PyErr_GivenExceptionMatches(err, type1) || PyErr_GivenExceptionMatches(err, type2))\n#endif\n#define __Pyx_PyException_Check(obj) __Pyx_TypeCheck(obj, PyExc_Exception)\n\nstatic CYTHON_UNUSED int __pyx_memoryview_getbuffer(PyObject *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags); /*proto*/\n/* ListCompAppend.proto */\n#if CYTHON_USE_PYLIST_INTERNALS && CYTHON_ASSUME_SAFE_MACROS\nstatic CYTHON_INLINE int __Pyx_ListComp_Append(PyObject* list, PyObject* x) {\n    PyListObject* L = (PyListObject*) list;\n    Py_ssize_t len = Py_SIZE(list);\n    if (likely(L->allocated > len)) {\n        Py_INCREF(x);\n        PyList_SET_ITEM(list, len, x);\n        __Pyx_SET_SIZE(list, len + 1);\n        return 0;\n    }\n    return PyList_Append(list, x);\n}\n#else\n#define __Pyx_ListComp_Append(L,x) PyList_Append(L,x)\n#endif\n\n/* PyIntBinop.proto */\n#if !CYTHON_COMPILING_IN_PYPY\nstatic PyObject* __Pyx_PyInt_AddObjC(PyObject *op1, PyObject *op2, long intval, int inplace, int zerodivision_check);\n#else\n#define __Pyx_PyInt_AddObjC(op1, op2, intval, inplace, zerodivision_check)\\\n    (inplace ? 
PyNumber_InPlaceAdd(op1, op2) : PyNumber_Add(op1, op2))\n#endif\n\n/* ListExtend.proto */\nstatic CYTHON_INLINE int __Pyx_PyList_Extend(PyObject* L, PyObject* v) {\n#if CYTHON_COMPILING_IN_CPYTHON\n    PyObject* none = _PyList_Extend((PyListObject*)L, v);\n    if (unlikely(!none))\n        return -1;\n    Py_DECREF(none);\n    return 0;\n#else\n    return PyList_SetSlice(L, PY_SSIZE_T_MAX, PY_SSIZE_T_MAX, v);\n#endif\n}\n\n/* ListAppend.proto */\n#if CYTHON_USE_PYLIST_INTERNALS && CYTHON_ASSUME_SAFE_MACROS\nstatic CYTHON_INLINE int __Pyx_PyList_Append(PyObject* list, PyObject* x) {\n    PyListObject* L = (PyListObject*) list;\n    Py_ssize_t len = Py_SIZE(list);\n    if (likely(L->allocated > len) & likely(len > (L->allocated >> 1))) {\n        Py_INCREF(x);\n        PyList_SET_ITEM(list, len, x);\n        __Pyx_SET_SIZE(list, len + 1);\n        return 0;\n    }\n    return PyList_Append(list, x);\n}\n#else\n#define __Pyx_PyList_Append(L,x) PyList_Append(L,x)\n#endif\n\n/* DivInt[long].proto */\nstatic CYTHON_INLINE long __Pyx_div_long(long, long);\n\n/* ImportFrom.proto */\nstatic PyObject* __Pyx_ImportFrom(PyObject* module, PyObject* name);\n\n/* HasAttr.proto */\nstatic CYTHON_INLINE int __Pyx_HasAttr(PyObject *, PyObject *);\n\n/* PyObject_GenericGetAttrNoDict.proto */\n#if CYTHON_USE_TYPE_SLOTS && CYTHON_USE_PYTYPE_LOOKUP && PY_VERSION_HEX < 0x03070000\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_GenericGetAttrNoDict(PyObject* obj, PyObject* attr_name);\n#else\n#define __Pyx_PyObject_GenericGetAttrNoDict PyObject_GenericGetAttr\n#endif\n\n/* PyObject_GenericGetAttr.proto */\n#if CYTHON_USE_TYPE_SLOTS && CYTHON_USE_PYTYPE_LOOKUP && PY_VERSION_HEX < 0x03070000\nstatic PyObject* __Pyx_PyObject_GenericGetAttr(PyObject* obj, PyObject* attr_name);\n#else\n#define __Pyx_PyObject_GenericGetAttr PyObject_GenericGetAttr\n#endif\n\n/* SetVTable.proto */\nstatic int __Pyx_SetVtable(PyObject *dict, void *vtable);\n\n/* PyObjectGetAttrStrNoError.proto */\nstatic 
CYTHON_INLINE PyObject* __Pyx_PyObject_GetAttrStrNoError(PyObject* obj, PyObject* attr_name);\n\n/* SetupReduce.proto */\nstatic int __Pyx_setup_reduce(PyObject* type_obj);\n\n/* TypeImport.proto */\n#ifndef __PYX_HAVE_RT_ImportType_proto\n#define __PYX_HAVE_RT_ImportType_proto\nenum __Pyx_ImportType_CheckSize {\n   __Pyx_ImportType_CheckSize_Error = 0,\n   __Pyx_ImportType_CheckSize_Warn = 1,\n   __Pyx_ImportType_CheckSize_Ignore = 2\n};\nstatic PyTypeObject *__Pyx_ImportType(PyObject* module, const char *module_name, const char *class_name, size_t size, enum __Pyx_ImportType_CheckSize check_size);\n#endif\n\n/* CLineInTraceback.proto */\n#ifdef CYTHON_CLINE_IN_TRACEBACK\n#define __Pyx_CLineForTraceback(tstate, c_line)  (((CYTHON_CLINE_IN_TRACEBACK)) ? c_line : 0)\n#else\nstatic int __Pyx_CLineForTraceback(PyThreadState *tstate, int c_line);\n#endif\n\n/* CodeObjectCache.proto */\ntypedef struct {\n    PyCodeObject* code_object;\n    int code_line;\n} __Pyx_CodeObjectCacheEntry;\nstruct __Pyx_CodeObjectCache {\n    int count;\n    int max_count;\n    __Pyx_CodeObjectCacheEntry* entries;\n};\nstatic struct __Pyx_CodeObjectCache __pyx_code_cache = {0,0,NULL};\nstatic int __pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry* entries, int count, int code_line);\nstatic PyCodeObject *__pyx_find_code_object(int code_line);\nstatic void __pyx_insert_code_object(int code_line, PyCodeObject* code_object);\n\n/* AddTraceback.proto */\nstatic void __Pyx_AddTraceback(const char *funcname, int c_line,\n                               int py_line, const char *filename);\n\n#if PY_MAJOR_VERSION < 3\n    static int __Pyx_GetBuffer(PyObject *obj, Py_buffer *view, int flags);\n    static void __Pyx_ReleaseBuffer(Py_buffer *view);\n#else\n    #define __Pyx_GetBuffer PyObject_GetBuffer\n    #define __Pyx_ReleaseBuffer PyBuffer_Release\n#endif\n\n\n/* BufferStructDeclare.proto */\ntypedef struct {\n  Py_ssize_t shape, strides, suboffsets;\n} __Pyx_Buf_DimInfo;\ntypedef struct {\n  
size_t refcount;\n  Py_buffer pybuffer;\n} __Pyx_Buffer;\ntypedef struct {\n  __Pyx_Buffer *rcbuffer;\n  char *data;\n  __Pyx_Buf_DimInfo diminfo[8];\n} __Pyx_LocalBuf_ND;\n\n/* MemviewSliceIsContig.proto */\nstatic int __pyx_memviewslice_is_contig(const __Pyx_memviewslice mvs, char order, int ndim);\n\n/* OverlappingSlices.proto */\nstatic int __pyx_slices_overlap(__Pyx_memviewslice *slice1,\n                                __Pyx_memviewslice *slice2,\n                                int ndim, size_t itemsize);\n\n/* Capsule.proto */\nstatic CYTHON_INLINE PyObject *__pyx_capsule_create(void *p, const char *sig);\n\n/* IsLittleEndian.proto */\nstatic CYTHON_INLINE int __Pyx_Is_Little_Endian(void);\n\n/* BufferFormatCheck.proto */\nstatic const char* __Pyx_BufFmt_CheckString(__Pyx_BufFmt_Context* ctx, const char* ts);\nstatic void __Pyx_BufFmt_Init(__Pyx_BufFmt_Context* ctx,\n                              __Pyx_BufFmt_StackElem* stack,\n                              __Pyx_TypeInfo* type);\n\n/* TypeInfoCompare.proto */\nstatic int __pyx_typeinfo_cmp(__Pyx_TypeInfo *a, __Pyx_TypeInfo *b);\n\n/* MemviewSliceValidateAndInit.proto */\nstatic int __Pyx_ValidateAndInit_memviewslice(\n                int *axes_specs,\n                int c_or_f_flag,\n                int buf_flags,\n                int ndim,\n                __Pyx_TypeInfo *dtype,\n                __Pyx_BufFmt_StackElem stack[],\n                __Pyx_memviewslice *memviewslice,\n                PyObject *original_obj);\n\n/* ObjectToMemviewSlice.proto */\nstatic CYTHON_INLINE __Pyx_memviewslice __Pyx_PyObject_to_MemoryviewSlice_d_d_dc_int(PyObject *, int writable_flag);\n\n/* ObjectToMemviewSlice.proto */\nstatic CYTHON_INLINE __Pyx_memviewslice __Pyx_PyObject_to_MemoryviewSlice_d_d_dc_float(PyObject *, int writable_flag);\n\n/* ObjectToMemviewSlice.proto */\nstatic CYTHON_INLINE __Pyx_memviewslice __Pyx_PyObject_to_MemoryviewSlice_dc_int(PyObject *, int writable_flag);\n\n/* GCCDiagnostics.proto */\n#if 
defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))\n#define __Pyx_HAS_GCC_DIAGNOSTIC\n#endif\n\n/* RealImag.proto */\n#if CYTHON_CCOMPLEX\n  #ifdef __cplusplus\n    #define __Pyx_CREAL(z) ((z).real())\n    #define __Pyx_CIMAG(z) ((z).imag())\n  #else\n    #define __Pyx_CREAL(z) (__real__(z))\n    #define __Pyx_CIMAG(z) (__imag__(z))\n  #endif\n#else\n    #define __Pyx_CREAL(z) ((z).real)\n    #define __Pyx_CIMAG(z) ((z).imag)\n#endif\n#if defined(__cplusplus) && CYTHON_CCOMPLEX\\\n        && (defined(_WIN32) || defined(__clang__) || (defined(__GNUC__) && (__GNUC__ >= 5 || __GNUC__ == 4 && __GNUC_MINOR__ >= 4 )) || __cplusplus >= 201103)\n    #define __Pyx_SET_CREAL(z,x) ((z).real(x))\n    #define __Pyx_SET_CIMAG(z,y) ((z).imag(y))\n#else\n    #define __Pyx_SET_CREAL(z,x) __Pyx_CREAL(z) = (x)\n    #define __Pyx_SET_CIMAG(z,y) __Pyx_CIMAG(z) = (y)\n#endif\n\n/* Arithmetic.proto */\n#if CYTHON_CCOMPLEX\n    #define __Pyx_c_eq_float(a, b)   ((a)==(b))\n    #define __Pyx_c_sum_float(a, b)  ((a)+(b))\n    #define __Pyx_c_diff_float(a, b) ((a)-(b))\n    #define __Pyx_c_prod_float(a, b) ((a)*(b))\n    #define __Pyx_c_quot_float(a, b) ((a)/(b))\n    #define __Pyx_c_neg_float(a)     (-(a))\n  #ifdef __cplusplus\n    #define __Pyx_c_is_zero_float(z) ((z)==(float)0)\n    #define __Pyx_c_conj_float(z)    (::std::conj(z))\n    #if 1\n        #define __Pyx_c_abs_float(z)     (::std::abs(z))\n        #define __Pyx_c_pow_float(a, b)  (::std::pow(a, b))\n    #endif\n  #else\n    #define __Pyx_c_is_zero_float(z) ((z)==0)\n    #define __Pyx_c_conj_float(z)    (conjf(z))\n    #if 1\n        #define __Pyx_c_abs_float(z)     (cabsf(z))\n        #define __Pyx_c_pow_float(a, b)  (cpowf(a, b))\n    #endif\n #endif\n#else\n    static CYTHON_INLINE int __Pyx_c_eq_float(__pyx_t_float_complex, __pyx_t_float_complex);\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_sum_float(__pyx_t_float_complex, __pyx_t_float_complex);\n    static CYTHON_INLINE 
__pyx_t_float_complex __Pyx_c_diff_float(__pyx_t_float_complex, __pyx_t_float_complex);\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_prod_float(__pyx_t_float_complex, __pyx_t_float_complex);\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_quot_float(__pyx_t_float_complex, __pyx_t_float_complex);\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_neg_float(__pyx_t_float_complex);\n    static CYTHON_INLINE int __Pyx_c_is_zero_float(__pyx_t_float_complex);\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_conj_float(__pyx_t_float_complex);\n    #if 1\n        static CYTHON_INLINE float __Pyx_c_abs_float(__pyx_t_float_complex);\n        static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_pow_float(__pyx_t_float_complex, __pyx_t_float_complex);\n    #endif\n#endif\n\n/* Arithmetic.proto */\n#if CYTHON_CCOMPLEX\n    #define __Pyx_c_eq_double(a, b)   ((a)==(b))\n    #define __Pyx_c_sum_double(a, b)  ((a)+(b))\n    #define __Pyx_c_diff_double(a, b) ((a)-(b))\n    #define __Pyx_c_prod_double(a, b) ((a)*(b))\n    #define __Pyx_c_quot_double(a, b) ((a)/(b))\n    #define __Pyx_c_neg_double(a)     (-(a))\n  #ifdef __cplusplus\n    #define __Pyx_c_is_zero_double(z) ((z)==(double)0)\n    #define __Pyx_c_conj_double(z)    (::std::conj(z))\n    #if 1\n        #define __Pyx_c_abs_double(z)     (::std::abs(z))\n        #define __Pyx_c_pow_double(a, b)  (::std::pow(a, b))\n    #endif\n  #else\n    #define __Pyx_c_is_zero_double(z) ((z)==0)\n    #define __Pyx_c_conj_double(z)    (conj(z))\n    #if 1\n        #define __Pyx_c_abs_double(z)     (cabs(z))\n        #define __Pyx_c_pow_double(a, b)  (cpow(a, b))\n    #endif\n #endif\n#else\n    static CYTHON_INLINE int __Pyx_c_eq_double(__pyx_t_double_complex, __pyx_t_double_complex);\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_sum_double(__pyx_t_double_complex, __pyx_t_double_complex);\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_diff_double(__pyx_t_double_complex, 
__pyx_t_double_complex);\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_prod_double(__pyx_t_double_complex, __pyx_t_double_complex);\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_quot_double(__pyx_t_double_complex, __pyx_t_double_complex);\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_neg_double(__pyx_t_double_complex);\n    static CYTHON_INLINE int __Pyx_c_is_zero_double(__pyx_t_double_complex);\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_conj_double(__pyx_t_double_complex);\n    #if 1\n        static CYTHON_INLINE double __Pyx_c_abs_double(__pyx_t_double_complex);\n        static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_pow_double(__pyx_t_double_complex, __pyx_t_double_complex);\n    #endif\n#endif\n\n/* MemviewSliceCopyTemplate.proto */\nstatic __Pyx_memviewslice\n__pyx_memoryview_copy_new_contig(const __Pyx_memviewslice *from_mvs,\n                                 const char *mode, int ndim,\n                                 size_t sizeof_dtype, int contig_flag,\n                                 int dtype_is_object);\n\n/* CIntToPy.proto */\nstatic CYTHON_INLINE PyObject* __Pyx_PyInt_From_int(int value);\n\n/* CIntFromPy.proto */\nstatic CYTHON_INLINE int __Pyx_PyInt_As_int(PyObject *);\n\n/* CIntToPy.proto */\nstatic CYTHON_INLINE PyObject* __Pyx_PyInt_From_long(long value);\n\n/* CIntFromPy.proto */\nstatic CYTHON_INLINE long __Pyx_PyInt_As_long(PyObject *);\n\n/* CIntFromPy.proto */\nstatic CYTHON_INLINE char __Pyx_PyInt_As_char(PyObject *);\n\n/* CheckBinaryVersion.proto */\nstatic int __Pyx_check_binary_version(void);\n\n/* InitStrings.proto */\nstatic int __Pyx_InitStrings(__Pyx_StringTabEntry *t);\n\nstatic PyObject *__pyx_array_get_memview(struct __pyx_array_obj *__pyx_v_self); /* proto*/\nstatic char *__pyx_memoryview_get_item_pointer(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_index); /* proto*/\nstatic PyObject *__pyx_memoryview_is_slice(struct __pyx_memoryview_obj *__pyx_v_self, 
PyObject *__pyx_v_obj); /* proto*/\nstatic PyObject *__pyx_memoryview_setitem_slice_assignment(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_dst, PyObject *__pyx_v_src); /* proto*/\nstatic PyObject *__pyx_memoryview_setitem_slice_assign_scalar(struct __pyx_memoryview_obj *__pyx_v_self, struct __pyx_memoryview_obj *__pyx_v_dst, PyObject *__pyx_v_value); /* proto*/\nstatic PyObject *__pyx_memoryview_setitem_indexed(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_index, PyObject *__pyx_v_value); /* proto*/\nstatic PyObject *__pyx_memoryview_convert_item_to_object(struct __pyx_memoryview_obj *__pyx_v_self, char *__pyx_v_itemp); /* proto*/\nstatic PyObject *__pyx_memoryview_assign_item_from_object(struct __pyx_memoryview_obj *__pyx_v_self, char *__pyx_v_itemp, PyObject *__pyx_v_value); /* proto*/\nstatic PyObject *__pyx_memoryviewslice_convert_item_to_object(struct __pyx_memoryviewslice_obj *__pyx_v_self, char *__pyx_v_itemp); /* proto*/\nstatic PyObject *__pyx_memoryviewslice_assign_item_from_object(struct __pyx_memoryviewslice_obj *__pyx_v_self, char *__pyx_v_itemp, PyObject *__pyx_v_value); /* proto*/\n\n/* Module declarations from 'cython.view' */\n\n/* Module declarations from 'cython' */\n\n/* Module declarations from 'cpython.buffer' */\n\n/* Module declarations from 'libc.string' */\n\n/* Module declarations from 'libc.stdio' */\n\n/* Module declarations from '__builtin__' */\n\n/* Module declarations from 'cpython.type' */\nstatic PyTypeObject *__pyx_ptype_7cpython_4type_type = 0;\n\n/* Module declarations from 'cpython' */\n\n/* Module declarations from 'cpython.object' */\n\n/* Module declarations from 'cpython.ref' */\n\n/* Module declarations from 'cpython.mem' */\n\n/* Module declarations from 'numpy' */\n\n/* Module declarations from 'numpy' */\nstatic PyTypeObject *__pyx_ptype_5numpy_dtype = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_flatiter = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_broadcast = 0;\nstatic PyTypeObject 
*__pyx_ptype_5numpy_ndarray = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_generic = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_number = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_integer = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_signedinteger = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_unsignedinteger = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_inexact = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_floating = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_complexfloating = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_flexible = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_character = 0;\nstatic PyTypeObject *__pyx_ptype_5numpy_ufunc = 0;\n\n/* Module declarations from 'TTS.tts.utils.monotonic_align.core' */\nstatic PyTypeObject *__pyx_array_type = 0;\nstatic PyTypeObject *__pyx_MemviewEnum_type = 0;\nstatic PyTypeObject *__pyx_memoryview_type = 0;\nstatic PyTypeObject *__pyx_memoryviewslice_type = 0;\nstatic PyObject *generic = 0;\nstatic PyObject *strided = 0;\nstatic PyObject *indirect = 0;\nstatic PyObject *contiguous = 0;\nstatic PyObject *indirect_contiguous = 0;\nstatic int __pyx_memoryview_thread_locks_used;\nstatic PyThread_type_lock __pyx_memoryview_thread_locks[8];\nstatic void __pyx_f_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_each(__Pyx_memviewslice, __Pyx_memviewslice, int, int, float); /*proto*/\nstatic void __pyx_f_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c(__Pyx_memviewslice, __Pyx_memviewslice, __Pyx_memviewslice, __Pyx_memviewslice, int __pyx_skip_dispatch, struct __pyx_opt_args_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c *__pyx_optional_args); /*proto*/\nstatic struct __pyx_array_obj *__pyx_array_new(PyObject *, Py_ssize_t, char *, char *, char *); /*proto*/\nstatic void *__pyx_align_pointer(void *, size_t); /*proto*/\nstatic PyObject *__pyx_memoryview_new(PyObject *, int, int, __Pyx_TypeInfo *); /*proto*/\nstatic CYTHON_INLINE int __pyx_memoryview_check(PyObject *); /*proto*/\nstatic PyObject 
*_unellipsify(PyObject *, int); /*proto*/\nstatic PyObject *assert_direct_dimensions(Py_ssize_t *, int); /*proto*/\nstatic struct __pyx_memoryview_obj *__pyx_memview_slice(struct __pyx_memoryview_obj *, PyObject *); /*proto*/\nstatic int __pyx_memoryview_slice_memviewslice(__Pyx_memviewslice *, Py_ssize_t, Py_ssize_t, Py_ssize_t, int, int, int *, Py_ssize_t, Py_ssize_t, Py_ssize_t, int, int, int, int); /*proto*/\nstatic char *__pyx_pybuffer_index(Py_buffer *, char *, Py_ssize_t, Py_ssize_t); /*proto*/\nstatic int __pyx_memslice_transpose(__Pyx_memviewslice *); /*proto*/\nstatic PyObject *__pyx_memoryview_fromslice(__Pyx_memviewslice, int, PyObject *(*)(char *), int (*)(char *, PyObject *), int); /*proto*/\nstatic __Pyx_memviewslice *__pyx_memoryview_get_slice_from_memoryview(struct __pyx_memoryview_obj *, __Pyx_memviewslice *); /*proto*/\nstatic void __pyx_memoryview_slice_copy(struct __pyx_memoryview_obj *, __Pyx_memviewslice *); /*proto*/\nstatic PyObject *__pyx_memoryview_copy_object(struct __pyx_memoryview_obj *); /*proto*/\nstatic PyObject *__pyx_memoryview_copy_object_from_slice(struct __pyx_memoryview_obj *, __Pyx_memviewslice *); /*proto*/\nstatic Py_ssize_t abs_py_ssize_t(Py_ssize_t); /*proto*/\nstatic char __pyx_get_best_slice_order(__Pyx_memviewslice *, int); /*proto*/\nstatic void _copy_strided_to_strided(char *, Py_ssize_t *, char *, Py_ssize_t *, Py_ssize_t *, Py_ssize_t *, int, size_t); /*proto*/\nstatic void copy_strided_to_strided(__Pyx_memviewslice *, __Pyx_memviewslice *, int, size_t); /*proto*/\nstatic Py_ssize_t __pyx_memoryview_slice_get_size(__Pyx_memviewslice *, int); /*proto*/\nstatic Py_ssize_t __pyx_fill_contig_strides_array(Py_ssize_t *, Py_ssize_t *, Py_ssize_t, int, char); /*proto*/\nstatic void *__pyx_memoryview_copy_data_to_temp(__Pyx_memviewslice *, __Pyx_memviewslice *, char, int); /*proto*/\nstatic int __pyx_memoryview_err_extents(int, Py_ssize_t, Py_ssize_t); /*proto*/\nstatic int __pyx_memoryview_err_dim(PyObject *, char *, 
int); /*proto*/\nstatic int __pyx_memoryview_err(PyObject *, char *); /*proto*/\nstatic int __pyx_memoryview_copy_contents(__Pyx_memviewslice, __Pyx_memviewslice, int, int, int); /*proto*/\nstatic void __pyx_memoryview_broadcast_leading(__Pyx_memviewslice *, int, int); /*proto*/\nstatic void __pyx_memoryview_refcount_copying(__Pyx_memviewslice *, int, int, int); /*proto*/\nstatic void __pyx_memoryview_refcount_objects_in_slice_with_gil(char *, Py_ssize_t *, Py_ssize_t *, int, int); /*proto*/\nstatic void __pyx_memoryview_refcount_objects_in_slice(char *, Py_ssize_t *, Py_ssize_t *, int, int); /*proto*/\nstatic void __pyx_memoryview_slice_assign_scalar(__Pyx_memviewslice *, int, size_t, void *, int); /*proto*/\nstatic void __pyx_memoryview__slice_assign_scalar(char *, Py_ssize_t *, Py_ssize_t *, int, size_t, void *); /*proto*/\nstatic PyObject *__pyx_unpickle_Enum__set_state(struct __pyx_MemviewEnum_obj *, PyObject *); /*proto*/\nstatic __Pyx_TypeInfo __Pyx_TypeInfo_int = { \"int\", NULL, sizeof(int), { 0 }, 0, IS_UNSIGNED(int) ? 
'U' : 'I', IS_UNSIGNED(int), 0 };\nstatic __Pyx_TypeInfo __Pyx_TypeInfo_float = { \"float\", NULL, sizeof(float), { 0 }, 0, 'R', 0, 0 };\n#define __Pyx_MODULE_NAME \"TTS.tts.utils.monotonic_align.core\"\nextern int __pyx_module_is_main_TTS__tts__utils__monotonic_align__core;\nint __pyx_module_is_main_TTS__tts__utils__monotonic_align__core = 0;\n\n/* Implementation of 'TTS.tts.utils.monotonic_align.core' */\nstatic PyObject *__pyx_builtin_range;\nstatic PyObject *__pyx_builtin_ImportError;\nstatic PyObject *__pyx_builtin_ValueError;\nstatic PyObject *__pyx_builtin_MemoryError;\nstatic PyObject *__pyx_builtin_enumerate;\nstatic PyObject *__pyx_builtin_TypeError;\nstatic PyObject *__pyx_builtin_Ellipsis;\nstatic PyObject *__pyx_builtin_id;\nstatic PyObject *__pyx_builtin_IndexError;\nstatic const char __pyx_k_O[] = \"O\";\nstatic const char __pyx_k_c[] = \"c\";\nstatic const char __pyx_k_id[] = \"id\";\nstatic const char __pyx_k_np[] = \"np\";\nstatic const char __pyx_k_new[] = \"__new__\";\nstatic const char __pyx_k_obj[] = \"obj\";\nstatic const char __pyx_k_base[] = \"base\";\nstatic const char __pyx_k_dict[] = \"__dict__\";\nstatic const char __pyx_k_main[] = \"__main__\";\nstatic const char __pyx_k_mode[] = \"mode\";\nstatic const char __pyx_k_name[] = \"name\";\nstatic const char __pyx_k_ndim[] = \"ndim\";\nstatic const char __pyx_k_pack[] = \"pack\";\nstatic const char __pyx_k_size[] = \"size\";\nstatic const char __pyx_k_step[] = \"step\";\nstatic const char __pyx_k_stop[] = \"stop\";\nstatic const char __pyx_k_t_xs[] = \"t_xs\";\nstatic const char __pyx_k_t_ys[] = \"t_ys\";\nstatic const char __pyx_k_test[] = \"__test__\";\nstatic const char __pyx_k_ASCII[] = \"ASCII\";\nstatic const char __pyx_k_class[] = \"__class__\";\nstatic const char __pyx_k_error[] = \"error\";\nstatic const char __pyx_k_flags[] = \"flags\";\nstatic const char __pyx_k_numpy[] = \"numpy\";\nstatic const char __pyx_k_paths[] = \"paths\";\nstatic const char __pyx_k_range[] = 
\"range\";\nstatic const char __pyx_k_shape[] = \"shape\";\nstatic const char __pyx_k_start[] = \"start\";\nstatic const char __pyx_k_encode[] = \"encode\";\nstatic const char __pyx_k_format[] = \"format\";\nstatic const char __pyx_k_import[] = \"__import__\";\nstatic const char __pyx_k_name_2[] = \"__name__\";\nstatic const char __pyx_k_pickle[] = \"pickle\";\nstatic const char __pyx_k_reduce[] = \"__reduce__\";\nstatic const char __pyx_k_struct[] = \"struct\";\nstatic const char __pyx_k_unpack[] = \"unpack\";\nstatic const char __pyx_k_update[] = \"update\";\nstatic const char __pyx_k_values[] = \"values\";\nstatic const char __pyx_k_fortran[] = \"fortran\";\nstatic const char __pyx_k_memview[] = \"memview\";\nstatic const char __pyx_k_Ellipsis[] = \"Ellipsis\";\nstatic const char __pyx_k_getstate[] = \"__getstate__\";\nstatic const char __pyx_k_itemsize[] = \"itemsize\";\nstatic const char __pyx_k_pyx_type[] = \"__pyx_type\";\nstatic const char __pyx_k_setstate[] = \"__setstate__\";\nstatic const char __pyx_k_TypeError[] = \"TypeError\";\nstatic const char __pyx_k_enumerate[] = \"enumerate\";\nstatic const char __pyx_k_pyx_state[] = \"__pyx_state\";\nstatic const char __pyx_k_reduce_ex[] = \"__reduce_ex__\";\nstatic const char __pyx_k_IndexError[] = \"IndexError\";\nstatic const char __pyx_k_ValueError[] = \"ValueError\";\nstatic const char __pyx_k_pyx_result[] = \"__pyx_result\";\nstatic const char __pyx_k_pyx_vtable[] = \"__pyx_vtable__\";\nstatic const char __pyx_k_ImportError[] = \"ImportError\";\nstatic const char __pyx_k_MemoryError[] = \"MemoryError\";\nstatic const char __pyx_k_PickleError[] = \"PickleError\";\nstatic const char __pyx_k_max_neg_val[] = \"max_neg_val\";\nstatic const char __pyx_k_pyx_checksum[] = \"__pyx_checksum\";\nstatic const char __pyx_k_stringsource[] = \"stringsource\";\nstatic const char __pyx_k_pyx_getbuffer[] = \"__pyx_getbuffer\";\nstatic const char __pyx_k_reduce_cython[] = \"__reduce_cython__\";\nstatic const char 
__pyx_k_View_MemoryView[] = \"View.MemoryView\";\nstatic const char __pyx_k_allocate_buffer[] = \"allocate_buffer\";\nstatic const char __pyx_k_dtype_is_object[] = \"dtype_is_object\";\nstatic const char __pyx_k_pyx_PickleError[] = \"__pyx_PickleError\";\nstatic const char __pyx_k_setstate_cython[] = \"__setstate_cython__\";\nstatic const char __pyx_k_pyx_unpickle_Enum[] = \"__pyx_unpickle_Enum\";\nstatic const char __pyx_k_cline_in_traceback[] = \"cline_in_traceback\";\nstatic const char __pyx_k_strided_and_direct[] = \"<strided and direct>\";\nstatic const char __pyx_k_strided_and_indirect[] = \"<strided and indirect>\";\nstatic const char __pyx_k_contiguous_and_direct[] = \"<contiguous and direct>\";\nstatic const char __pyx_k_MemoryView_of_r_object[] = \"<MemoryView of %r object>\";\nstatic const char __pyx_k_MemoryView_of_r_at_0x_x[] = \"<MemoryView of %r at 0x%x>\";\nstatic const char __pyx_k_contiguous_and_indirect[] = \"<contiguous and indirect>\";\nstatic const char __pyx_k_Cannot_index_with_type_s[] = \"Cannot index with type '%s'\";\nstatic const char __pyx_k_Invalid_shape_in_axis_d_d[] = \"Invalid shape in axis %d: %d.\";\nstatic const char __pyx_k_itemsize_0_for_cython_array[] = \"itemsize <= 0 for cython.array\";\nstatic const char __pyx_k_unable_to_allocate_array_data[] = \"unable to allocate array data.\";\nstatic const char __pyx_k_strided_and_direct_or_indirect[] = \"<strided and direct or indirect>\";\nstatic const char __pyx_k_numpy_core_multiarray_failed_to[] = \"numpy.core.multiarray failed to import\";\nstatic const char __pyx_k_Buffer_view_does_not_expose_stri[] = \"Buffer view does not expose strides\";\nstatic const char __pyx_k_Can_only_create_a_buffer_that_is[] = \"Can only create a buffer that is contiguous in memory.\";\nstatic const char __pyx_k_Cannot_assign_to_read_only_memor[] = \"Cannot assign to read-only memoryview\";\nstatic const char __pyx_k_Cannot_create_writable_memory_vi[] = \"Cannot create writable memory view from 
read-only memoryview\";\nstatic const char __pyx_k_Empty_shape_tuple_for_cython_arr[] = \"Empty shape tuple for cython.array\";\nstatic const char __pyx_k_Incompatible_checksums_s_vs_0xb0[] = \"Incompatible checksums (%s vs 0xb068931 = (name))\";\nstatic const char __pyx_k_Indirect_dimensions_not_supporte[] = \"Indirect dimensions not supported\";\nstatic const char __pyx_k_Invalid_mode_expected_c_or_fortr[] = \"Invalid mode, expected 'c' or 'fortran', got %s\";\nstatic const char __pyx_k_Out_of_bounds_on_buffer_access_a[] = \"Out of bounds on buffer access (axis %d)\";\nstatic const char __pyx_k_Unable_to_convert_item_to_object[] = \"Unable to convert item to object\";\nstatic const char __pyx_k_got_differing_extents_in_dimensi[] = \"got differing extents in dimension %d (got %d and %d)\";\nstatic const char __pyx_k_no_default___reduce___due_to_non[] = \"no default __reduce__ due to non-trivial __cinit__\";\nstatic const char __pyx_k_numpy_core_umath_failed_to_impor[] = \"numpy.core.umath failed to import\";\nstatic const char __pyx_k_unable_to_allocate_shape_and_str[] = \"unable to allocate shape and strides.\";\nstatic PyObject *__pyx_n_s_ASCII;\nstatic PyObject *__pyx_kp_s_Buffer_view_does_not_expose_stri;\nstatic PyObject *__pyx_kp_s_Can_only_create_a_buffer_that_is;\nstatic PyObject *__pyx_kp_s_Cannot_assign_to_read_only_memor;\nstatic PyObject *__pyx_kp_s_Cannot_create_writable_memory_vi;\nstatic PyObject *__pyx_kp_s_Cannot_index_with_type_s;\nstatic PyObject *__pyx_n_s_Ellipsis;\nstatic PyObject *__pyx_kp_s_Empty_shape_tuple_for_cython_arr;\nstatic PyObject *__pyx_n_s_ImportError;\nstatic PyObject *__pyx_kp_s_Incompatible_checksums_s_vs_0xb0;\nstatic PyObject *__pyx_n_s_IndexError;\nstatic PyObject *__pyx_kp_s_Indirect_dimensions_not_supporte;\nstatic PyObject *__pyx_kp_s_Invalid_mode_expected_c_or_fortr;\nstatic PyObject *__pyx_kp_s_Invalid_shape_in_axis_d_d;\nstatic PyObject *__pyx_n_s_MemoryError;\nstatic PyObject 
*__pyx_kp_s_MemoryView_of_r_at_0x_x;\nstatic PyObject *__pyx_kp_s_MemoryView_of_r_object;\nstatic PyObject *__pyx_n_b_O;\nstatic PyObject *__pyx_kp_s_Out_of_bounds_on_buffer_access_a;\nstatic PyObject *__pyx_n_s_PickleError;\nstatic PyObject *__pyx_n_s_TypeError;\nstatic PyObject *__pyx_kp_s_Unable_to_convert_item_to_object;\nstatic PyObject *__pyx_n_s_ValueError;\nstatic PyObject *__pyx_n_s_View_MemoryView;\nstatic PyObject *__pyx_n_s_allocate_buffer;\nstatic PyObject *__pyx_n_s_base;\nstatic PyObject *__pyx_n_s_c;\nstatic PyObject *__pyx_n_u_c;\nstatic PyObject *__pyx_n_s_class;\nstatic PyObject *__pyx_n_s_cline_in_traceback;\nstatic PyObject *__pyx_kp_s_contiguous_and_direct;\nstatic PyObject *__pyx_kp_s_contiguous_and_indirect;\nstatic PyObject *__pyx_n_s_dict;\nstatic PyObject *__pyx_n_s_dtype_is_object;\nstatic PyObject *__pyx_n_s_encode;\nstatic PyObject *__pyx_n_s_enumerate;\nstatic PyObject *__pyx_n_s_error;\nstatic PyObject *__pyx_n_s_flags;\nstatic PyObject *__pyx_n_s_format;\nstatic PyObject *__pyx_n_s_fortran;\nstatic PyObject *__pyx_n_u_fortran;\nstatic PyObject *__pyx_n_s_getstate;\nstatic PyObject *__pyx_kp_s_got_differing_extents_in_dimensi;\nstatic PyObject *__pyx_n_s_id;\nstatic PyObject *__pyx_n_s_import;\nstatic PyObject *__pyx_n_s_itemsize;\nstatic PyObject *__pyx_kp_s_itemsize_0_for_cython_array;\nstatic PyObject *__pyx_n_s_main;\nstatic PyObject *__pyx_n_s_max_neg_val;\nstatic PyObject *__pyx_n_s_memview;\nstatic PyObject *__pyx_n_s_mode;\nstatic PyObject *__pyx_n_s_name;\nstatic PyObject *__pyx_n_s_name_2;\nstatic PyObject *__pyx_n_s_ndim;\nstatic PyObject *__pyx_n_s_new;\nstatic PyObject *__pyx_kp_s_no_default___reduce___due_to_non;\nstatic PyObject *__pyx_n_s_np;\nstatic PyObject *__pyx_n_s_numpy;\nstatic PyObject *__pyx_kp_u_numpy_core_multiarray_failed_to;\nstatic PyObject *__pyx_kp_u_numpy_core_umath_failed_to_impor;\nstatic PyObject *__pyx_n_s_obj;\nstatic PyObject *__pyx_n_s_pack;\nstatic PyObject *__pyx_n_s_paths;\nstatic PyObject 
*__pyx_n_s_pickle;\nstatic PyObject *__pyx_n_s_pyx_PickleError;\nstatic PyObject *__pyx_n_s_pyx_checksum;\nstatic PyObject *__pyx_n_s_pyx_getbuffer;\nstatic PyObject *__pyx_n_s_pyx_result;\nstatic PyObject *__pyx_n_s_pyx_state;\nstatic PyObject *__pyx_n_s_pyx_type;\nstatic PyObject *__pyx_n_s_pyx_unpickle_Enum;\nstatic PyObject *__pyx_n_s_pyx_vtable;\nstatic PyObject *__pyx_n_s_range;\nstatic PyObject *__pyx_n_s_reduce;\nstatic PyObject *__pyx_n_s_reduce_cython;\nstatic PyObject *__pyx_n_s_reduce_ex;\nstatic PyObject *__pyx_n_s_setstate;\nstatic PyObject *__pyx_n_s_setstate_cython;\nstatic PyObject *__pyx_n_s_shape;\nstatic PyObject *__pyx_n_s_size;\nstatic PyObject *__pyx_n_s_start;\nstatic PyObject *__pyx_n_s_step;\nstatic PyObject *__pyx_n_s_stop;\nstatic PyObject *__pyx_kp_s_strided_and_direct;\nstatic PyObject *__pyx_kp_s_strided_and_direct_or_indirect;\nstatic PyObject *__pyx_kp_s_strided_and_indirect;\nstatic PyObject *__pyx_kp_s_stringsource;\nstatic PyObject *__pyx_n_s_struct;\nstatic PyObject *__pyx_n_s_t_xs;\nstatic PyObject *__pyx_n_s_t_ys;\nstatic PyObject *__pyx_n_s_test;\nstatic PyObject *__pyx_kp_s_unable_to_allocate_array_data;\nstatic PyObject *__pyx_kp_s_unable_to_allocate_shape_and_str;\nstatic PyObject *__pyx_n_s_unpack;\nstatic PyObject *__pyx_n_s_update;\nstatic PyObject *__pyx_n_s_values;\nstatic PyObject *__pyx_pf_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c(CYTHON_UNUSED PyObject *__pyx_self, __Pyx_memviewslice __pyx_v_paths, __Pyx_memviewslice __pyx_v_values, __Pyx_memviewslice __pyx_v_t_xs, __Pyx_memviewslice __pyx_v_t_ys, float __pyx_v_max_neg_val); /* proto */\nstatic int __pyx_array___pyx_pf_15View_dot_MemoryView_5array___cinit__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_shape, Py_ssize_t __pyx_v_itemsize, PyObject *__pyx_v_format, PyObject *__pyx_v_mode, int __pyx_v_allocate_buffer); /* proto */\nstatic int __pyx_array___pyx_pf_15View_dot_MemoryView_5array_2__getbuffer__(struct __pyx_array_obj *__pyx_v_self, 
Py_buffer *__pyx_v_info, int __pyx_v_flags); /* proto */\nstatic void __pyx_array___pyx_pf_15View_dot_MemoryView_5array_4__dealloc__(struct __pyx_array_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_5array_7memview___get__(struct __pyx_array_obj *__pyx_v_self); /* proto */\nstatic Py_ssize_t __pyx_array___pyx_pf_15View_dot_MemoryView_5array_6__len__(struct __pyx_array_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_array___pyx_pf_15View_dot_MemoryView_5array_8__getattr__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_attr); /* proto */\nstatic PyObject *__pyx_array___pyx_pf_15View_dot_MemoryView_5array_10__getitem__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_item); /* proto */\nstatic int __pyx_array___pyx_pf_15View_dot_MemoryView_5array_12__setitem__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_item, PyObject *__pyx_v_value); /* proto */\nstatic PyObject *__pyx_pf___pyx_array___reduce_cython__(CYTHON_UNUSED struct __pyx_array_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf___pyx_array_2__setstate_cython__(CYTHON_UNUSED struct __pyx_array_obj *__pyx_v_self, CYTHON_UNUSED PyObject *__pyx_v___pyx_state); /* proto */\nstatic int __pyx_MemviewEnum___pyx_pf_15View_dot_MemoryView_4Enum___init__(struct __pyx_MemviewEnum_obj *__pyx_v_self, PyObject *__pyx_v_name); /* proto */\nstatic PyObject *__pyx_MemviewEnum___pyx_pf_15View_dot_MemoryView_4Enum_2__repr__(struct __pyx_MemviewEnum_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf___pyx_MemviewEnum___reduce_cython__(struct __pyx_MemviewEnum_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf___pyx_MemviewEnum_2__setstate_cython__(struct __pyx_MemviewEnum_obj *__pyx_v_self, PyObject *__pyx_v___pyx_state); /* proto */\nstatic int __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview___cinit__(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_obj, int __pyx_v_flags, int __pyx_v_dtype_is_object); /* proto 
*/\nstatic void __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_2__dealloc__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_4__getitem__(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_index); /* proto */\nstatic int __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_6__setitem__(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_index, PyObject *__pyx_v_value); /* proto */\nstatic int __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_8__getbuffer__(struct __pyx_memoryview_obj *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_1T___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_4base___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_5shape___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_7strides___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_10suboffsets___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_4ndim___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_8itemsize___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_6nbytes___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_4size___get__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic Py_ssize_t __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_10__len__(struct 
__pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_12__repr__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_14__str__(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_16is_c_contig(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_18is_f_contig(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_20copy(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_22copy_fortran(struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf___pyx_memoryview___reduce_cython__(CYTHON_UNUSED struct __pyx_memoryview_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf___pyx_memoryview_2__setstate_cython__(CYTHON_UNUSED struct __pyx_memoryview_obj *__pyx_v_self, CYTHON_UNUSED PyObject *__pyx_v___pyx_state); /* proto */\nstatic void __pyx_memoryviewslice___pyx_pf_15View_dot_MemoryView_16_memoryviewslice___dealloc__(struct __pyx_memoryviewslice_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_16_memoryviewslice_4base___get__(struct __pyx_memoryviewslice_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf___pyx_memoryviewslice___reduce_cython__(CYTHON_UNUSED struct __pyx_memoryviewslice_obj *__pyx_v_self); /* proto */\nstatic PyObject *__pyx_pf___pyx_memoryviewslice_2__setstate_cython__(CYTHON_UNUSED struct __pyx_memoryviewslice_obj *__pyx_v_self, CYTHON_UNUSED PyObject *__pyx_v___pyx_state); /* proto */\nstatic PyObject *__pyx_pf_15View_dot_MemoryView___pyx_unpickle_Enum(CYTHON_UNUSED PyObject *__pyx_self, PyObject 
*__pyx_v___pyx_type, long __pyx_v___pyx_checksum, PyObject *__pyx_v___pyx_state); /* proto */\nstatic PyObject *__pyx_tp_new_array(PyTypeObject *t, PyObject *a, PyObject *k); /*proto*/\nstatic PyObject *__pyx_tp_new_Enum(PyTypeObject *t, PyObject *a, PyObject *k); /*proto*/\nstatic PyObject *__pyx_tp_new_memoryview(PyTypeObject *t, PyObject *a, PyObject *k); /*proto*/\nstatic PyObject *__pyx_tp_new__memoryviewslice(PyTypeObject *t, PyObject *a, PyObject *k); /*proto*/\nstatic PyObject *__pyx_int_0;\nstatic PyObject *__pyx_int_1;\nstatic PyObject *__pyx_int_184977713;\nstatic PyObject *__pyx_int_neg_1;\nstatic float __pyx_k_;\nstatic PyObject *__pyx_tuple__2;\nstatic PyObject *__pyx_tuple__3;\nstatic PyObject *__pyx_tuple__4;\nstatic PyObject *__pyx_tuple__5;\nstatic PyObject *__pyx_tuple__6;\nstatic PyObject *__pyx_tuple__7;\nstatic PyObject *__pyx_tuple__8;\nstatic PyObject *__pyx_tuple__9;\nstatic PyObject *__pyx_slice__18;\nstatic PyObject *__pyx_tuple__10;\nstatic PyObject *__pyx_tuple__11;\nstatic PyObject *__pyx_tuple__12;\nstatic PyObject *__pyx_tuple__13;\nstatic PyObject *__pyx_tuple__14;\nstatic PyObject *__pyx_tuple__15;\nstatic PyObject *__pyx_tuple__16;\nstatic PyObject *__pyx_tuple__17;\nstatic PyObject *__pyx_tuple__19;\nstatic PyObject *__pyx_tuple__20;\nstatic PyObject *__pyx_tuple__21;\nstatic PyObject *__pyx_tuple__22;\nstatic PyObject *__pyx_tuple__23;\nstatic PyObject *__pyx_tuple__24;\nstatic PyObject *__pyx_tuple__25;\nstatic PyObject *__pyx_tuple__26;\nstatic PyObject *__pyx_tuple__27;\nstatic PyObject *__pyx_codeobj__28;\n/* Late includes */\n\n/* \"TTS/tts/utils/monotonic_align/core.pyx\":11\n * @cython.boundscheck(False)\n * @cython.wraparound(False)\n * cdef void maximum_path_each(int[:,::1] path, float[:,::1] value, int t_x, int t_y, float max_neg_val) nogil:             # <<<<<<<<<<<<<<\n *   cdef int x\n *   cdef int y\n */\n\nstatic void __pyx_f_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_each(__Pyx_memviewslice 
__pyx_v_path, __Pyx_memviewslice __pyx_v_value, int __pyx_v_t_x, int __pyx_v_t_y, float __pyx_v_max_neg_val) {\n  int __pyx_v_x;\n  int __pyx_v_y;\n  float __pyx_v_v_prev;\n  float __pyx_v_v_cur;\n  int __pyx_v_index;\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  long __pyx_t_4;\n  int __pyx_t_5;\n  long __pyx_t_6;\n  long __pyx_t_7;\n  int __pyx_t_8;\n  Py_ssize_t __pyx_t_9;\n  Py_ssize_t __pyx_t_10;\n  float __pyx_t_11;\n  float __pyx_t_12;\n  float __pyx_t_13;\n  Py_ssize_t __pyx_t_14;\n  Py_ssize_t __pyx_t_15;\n  int __pyx_t_16;\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":17\n *   cdef float v_cur\n *   cdef float tmp\n *   cdef int index = t_x - 1             # <<<<<<<<<<<<<<\n * \n *   for y in range(t_y):\n */\n  __pyx_v_index = (__pyx_v_t_x - 1);\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":19\n *   cdef int index = t_x - 1\n * \n *   for y in range(t_y):             # <<<<<<<<<<<<<<\n *     for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):\n *       if x == y:\n */\n  __pyx_t_1 = __pyx_v_t_y;\n  __pyx_t_2 = __pyx_t_1;\n  for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=1) {\n    __pyx_v_y = __pyx_t_3;\n\n    /* \"TTS/tts/utils/monotonic_align/core.pyx\":20\n * \n *   for y in range(t_y):\n *     for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):             # <<<<<<<<<<<<<<\n *       if x == y:\n *         v_cur = max_neg_val\n */\n    __pyx_t_4 = (__pyx_v_y + 1);\n    __pyx_t_5 = __pyx_v_t_x;\n    if (((__pyx_t_4 < __pyx_t_5) != 0)) {\n      __pyx_t_6 = __pyx_t_4;\n    } else {\n      __pyx_t_6 = __pyx_t_5;\n    }\n    __pyx_t_4 = __pyx_t_6;\n    __pyx_t_5 = ((__pyx_v_t_x + __pyx_v_y) - __pyx_v_t_y);\n    __pyx_t_6 = 0;\n    if (((__pyx_t_5 > __pyx_t_6) != 0)) {\n      __pyx_t_7 = __pyx_t_5;\n    } else {\n      __pyx_t_7 = __pyx_t_6;\n    }\n    __pyx_t_6 = __pyx_t_4;\n    for (__pyx_t_5 = __pyx_t_7; __pyx_t_5 < __pyx_t_6; __pyx_t_5+=1) {\n      __pyx_v_x = __pyx_t_5;\n\n      /* 
\"TTS/tts/utils/monotonic_align/core.pyx\":21\n *   for y in range(t_y):\n *     for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):\n *       if x == y:             # <<<<<<<<<<<<<<\n *         v_cur = max_neg_val\n *       else:\n */\n      __pyx_t_8 = ((__pyx_v_x == __pyx_v_y) != 0);\n      if (__pyx_t_8) {\n\n        /* \"TTS/tts/utils/monotonic_align/core.pyx\":22\n *     for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):\n *       if x == y:\n *         v_cur = max_neg_val             # <<<<<<<<<<<<<<\n *       else:\n *         v_cur = value[x, y-1]\n */\n        __pyx_v_v_cur = __pyx_v_max_neg_val;\n\n        /* \"TTS/tts/utils/monotonic_align/core.pyx\":21\n *   for y in range(t_y):\n *     for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):\n *       if x == y:             # <<<<<<<<<<<<<<\n *         v_cur = max_neg_val\n *       else:\n */\n        goto __pyx_L7;\n      }\n\n      /* \"TTS/tts/utils/monotonic_align/core.pyx\":24\n *         v_cur = max_neg_val\n *       else:\n *         v_cur = value[x, y-1]             # <<<<<<<<<<<<<<\n *       if x == 0:\n *         if y == 0:\n */\n      /*else*/ {\n        __pyx_t_9 = __pyx_v_x;\n        __pyx_t_10 = (__pyx_v_y - 1);\n        __pyx_v_v_cur = (*((float *) ( /* dim=1 */ ((char *) (((float *) ( /* dim=0 */ (__pyx_v_value.data + __pyx_t_9 * __pyx_v_value.strides[0]) )) + __pyx_t_10)) )));\n      }\n      __pyx_L7:;\n\n      /* \"TTS/tts/utils/monotonic_align/core.pyx\":25\n *       else:\n *         v_cur = value[x, y-1]\n *       if x == 0:             # <<<<<<<<<<<<<<\n *         if y == 0:\n *           v_prev = 0.\n */\n      __pyx_t_8 = ((__pyx_v_x == 0) != 0);\n      if (__pyx_t_8) {\n\n        /* \"TTS/tts/utils/monotonic_align/core.pyx\":26\n *         v_cur = value[x, y-1]\n *       if x == 0:\n *         if y == 0:             # <<<<<<<<<<<<<<\n *           v_prev = 0.\n *         else:\n */\n        __pyx_t_8 = ((__pyx_v_y == 0) != 0);\n        if (__pyx_t_8) {\n\n          /* 
\"TTS/tts/utils/monotonic_align/core.pyx\":27\n *       if x == 0:\n *         if y == 0:\n *           v_prev = 0.             # <<<<<<<<<<<<<<\n *         else:\n *           v_prev = max_neg_val\n */\n          __pyx_v_v_prev = 0.;\n\n          /* \"TTS/tts/utils/monotonic_align/core.pyx\":26\n *         v_cur = value[x, y-1]\n *       if x == 0:\n *         if y == 0:             # <<<<<<<<<<<<<<\n *           v_prev = 0.\n *         else:\n */\n          goto __pyx_L9;\n        }\n\n        /* \"TTS/tts/utils/monotonic_align/core.pyx\":29\n *           v_prev = 0.\n *         else:\n *           v_prev = max_neg_val             # <<<<<<<<<<<<<<\n *       else:\n *         v_prev = value[x-1, y-1]\n */\n        /*else*/ {\n          __pyx_v_v_prev = __pyx_v_max_neg_val;\n        }\n        __pyx_L9:;\n\n        /* \"TTS/tts/utils/monotonic_align/core.pyx\":25\n *       else:\n *         v_cur = value[x, y-1]\n *       if x == 0:             # <<<<<<<<<<<<<<\n *         if y == 0:\n *           v_prev = 0.\n */\n        goto __pyx_L8;\n      }\n\n      /* \"TTS/tts/utils/monotonic_align/core.pyx\":31\n *           v_prev = max_neg_val\n *       else:\n *         v_prev = value[x-1, y-1]             # <<<<<<<<<<<<<<\n *       value[x, y] = max(v_cur, v_prev) + value[x, y]\n * \n */\n      /*else*/ {\n        __pyx_t_10 = (__pyx_v_x - 1);\n        __pyx_t_9 = (__pyx_v_y - 1);\n        __pyx_v_v_prev = (*((float *) ( /* dim=1 */ ((char *) (((float *) ( /* dim=0 */ (__pyx_v_value.data + __pyx_t_10 * __pyx_v_value.strides[0]) )) + __pyx_t_9)) )));\n      }\n      __pyx_L8:;\n\n      /* \"TTS/tts/utils/monotonic_align/core.pyx\":32\n *       else:\n *         v_prev = value[x-1, y-1]\n *       value[x, y] = max(v_cur, v_prev) + value[x, y]             # <<<<<<<<<<<<<<\n * \n *   for y in range(t_y - 1, -1, -1):\n */\n      __pyx_t_11 = __pyx_v_v_prev;\n      __pyx_t_12 = __pyx_v_v_cur;\n      if (((__pyx_t_11 > __pyx_t_12) != 0)) {\n        __pyx_t_13 = __pyx_t_11;\n  
    } else {\n        __pyx_t_13 = __pyx_t_12;\n      }\n      __pyx_t_9 = __pyx_v_x;\n      __pyx_t_10 = __pyx_v_y;\n      __pyx_t_14 = __pyx_v_x;\n      __pyx_t_15 = __pyx_v_y;\n      *((float *) ( /* dim=1 */ ((char *) (((float *) ( /* dim=0 */ (__pyx_v_value.data + __pyx_t_14 * __pyx_v_value.strides[0]) )) + __pyx_t_15)) )) = (__pyx_t_13 + (*((float *) ( /* dim=1 */ ((char *) (((float *) ( /* dim=0 */ (__pyx_v_value.data + __pyx_t_9 * __pyx_v_value.strides[0]) )) + __pyx_t_10)) ))));\n    }\n  }\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":34\n *       value[x, y] = max(v_cur, v_prev) + value[x, y]\n * \n *   for y in range(t_y - 1, -1, -1):             # <<<<<<<<<<<<<<\n *     path[index, y] = 1\n *     if index != 0 and (index == y or value[index, y-1] < value[index-1, y-1]):\n */\n  for (__pyx_t_1 = (__pyx_v_t_y - 1); __pyx_t_1 > -1; __pyx_t_1-=1) {\n    __pyx_v_y = __pyx_t_1;\n\n    /* \"TTS/tts/utils/monotonic_align/core.pyx\":35\n * \n *   for y in range(t_y - 1, -1, -1):\n *     path[index, y] = 1             # <<<<<<<<<<<<<<\n *     if index != 0 and (index == y or value[index, y-1] < value[index-1, y-1]):\n *       index = index - 1\n */\n    __pyx_t_10 = __pyx_v_index;\n    __pyx_t_9 = __pyx_v_y;\n    *((int *) ( /* dim=1 */ ((char *) (((int *) ( /* dim=0 */ (__pyx_v_path.data + __pyx_t_10 * __pyx_v_path.strides[0]) )) + __pyx_t_9)) )) = 1;\n\n    /* \"TTS/tts/utils/monotonic_align/core.pyx\":36\n *   for y in range(t_y - 1, -1, -1):\n *     path[index, y] = 1\n *     if index != 0 and (index == y or value[index, y-1] < value[index-1, y-1]):             # <<<<<<<<<<<<<<\n *       index = index - 1\n * \n */\n    __pyx_t_16 = ((__pyx_v_index != 0) != 0);\n    if (__pyx_t_16) {\n    } else {\n      __pyx_t_8 = __pyx_t_16;\n      goto __pyx_L13_bool_binop_done;\n    }\n    __pyx_t_16 = ((__pyx_v_index == __pyx_v_y) != 0);\n    if (!__pyx_t_16) {\n    } else {\n      __pyx_t_8 = __pyx_t_16;\n      goto __pyx_L13_bool_binop_done;\n    }\n    
__pyx_t_9 = __pyx_v_index;\n    __pyx_t_10 = (__pyx_v_y - 1);\n    __pyx_t_15 = (__pyx_v_index - 1);\n    __pyx_t_14 = (__pyx_v_y - 1);\n    __pyx_t_16 = (((*((float *) ( /* dim=1 */ ((char *) (((float *) ( /* dim=0 */ (__pyx_v_value.data + __pyx_t_9 * __pyx_v_value.strides[0]) )) + __pyx_t_10)) ))) < (*((float *) ( /* dim=1 */ ((char *) (((float *) ( /* dim=0 */ (__pyx_v_value.data + __pyx_t_15 * __pyx_v_value.strides[0]) )) + __pyx_t_14)) )))) != 0);\n    __pyx_t_8 = __pyx_t_16;\n    __pyx_L13_bool_binop_done:;\n    if (__pyx_t_8) {\n\n      /* \"TTS/tts/utils/monotonic_align/core.pyx\":37\n *     path[index, y] = 1\n *     if index != 0 and (index == y or value[index, y-1] < value[index-1, y-1]):\n *       index = index - 1             # <<<<<<<<<<<<<<\n * \n * \n */\n      __pyx_v_index = (__pyx_v_index - 1);\n\n      /* \"TTS/tts/utils/monotonic_align/core.pyx\":36\n *   for y in range(t_y - 1, -1, -1):\n *     path[index, y] = 1\n *     if index != 0 and (index == y or value[index, y-1] < value[index-1, y-1]):             # <<<<<<<<<<<<<<\n *       index = index - 1\n * \n */\n    }\n  }\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":11\n * @cython.boundscheck(False)\n * @cython.wraparound(False)\n * cdef void maximum_path_each(int[:,::1] path, float[:,::1] value, int t_x, int t_y, float max_neg_val) nogil:             # <<<<<<<<<<<<<<\n *   cdef int x\n *   cdef int y\n */\n\n  /* function exit code */\n}\n\n/* \"TTS/tts/utils/monotonic_align/core.pyx\":42\n * @cython.boundscheck(False)\n * @cython.wraparound(False)\n * cpdef void maximum_path_c(int[:,:,::1] paths, float[:,:,::1] values, int[::1] t_xs, int[::1] t_ys, float max_neg_val=-1e9) nogil:             # <<<<<<<<<<<<<<\n *   cdef int b = values.shape[0]\n * \n */\n\nstatic PyObject *__pyx_pw_3TTS_3tts_5utils_15monotonic_align_4core_1maximum_path_c(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/\nstatic void 
__pyx_f_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c(__Pyx_memviewslice __pyx_v_paths, __Pyx_memviewslice __pyx_v_values, __Pyx_memviewslice __pyx_v_t_xs, __Pyx_memviewslice __pyx_v_t_ys, CYTHON_UNUSED int __pyx_skip_dispatch, struct __pyx_opt_args_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c *__pyx_optional_args) {\n  float __pyx_v_max_neg_val = __pyx_k_;\n  CYTHON_UNUSED int __pyx_v_b;\n  int __pyx_v_i;\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  __Pyx_memviewslice __pyx_t_4 = { 0, 0, { 0 }, { 0 }, { 0 } };\n  __Pyx_memviewslice __pyx_t_5 = { 0, 0, { 0 }, { 0 }, { 0 } };\n  Py_ssize_t __pyx_t_6;\n  Py_ssize_t __pyx_t_7;\n  if (__pyx_optional_args) {\n    if (__pyx_optional_args->__pyx_n > 0) {\n      __pyx_v_max_neg_val = __pyx_optional_args->max_neg_val;\n    }\n  }\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":43\n * @cython.wraparound(False)\n * cpdef void maximum_path_c(int[:,:,::1] paths, float[:,:,::1] values, int[::1] t_xs, int[::1] t_ys, float max_neg_val=-1e9) nogil:\n *   cdef int b = values.shape[0]             # <<<<<<<<<<<<<<\n * \n *   cdef int i\n */\n  __pyx_v_b = (__pyx_v_values.shape[0]);\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":46\n * \n *   cdef int i\n *   for i in prange(b, nogil=True):             # <<<<<<<<<<<<<<\n *     maximum_path_each(paths[i], values[i], t_xs[i], t_ys[i], max_neg_val)\n */\n  {\n      #ifdef WITH_THREAD\n      PyThreadState *_save;\n      Py_UNBLOCK_THREADS\n      __Pyx_FastGIL_Remember();\n      #endif\n      /*try:*/ {\n        __pyx_t_1 = __pyx_v_b;\n        if ((1 == 0)) abort();\n        {\n            #if ((defined(__APPLE__) || defined(__OSX__)) && (defined(__GNUC__) && (__GNUC__ > 2 || (__GNUC__ == 2 && (__GNUC_MINOR__ > 95)))))\n                #undef likely\n                #undef unlikely\n                #define likely(x)   (x)\n                #define unlikely(x) (x)\n            #endif\n            __pyx_t_3 = (__pyx_t_1 - 0 + 1 - 1/abs(1)) / 1;\n         
   if (__pyx_t_3 > 0)\n            {\n                #ifdef _OPENMP\n                #pragma omp parallel private(__pyx_t_6, __pyx_t_7) firstprivate(__pyx_t_4, __pyx_t_5)\n                #endif /* _OPENMP */\n                {\n                    #ifdef _OPENMP\n                    #pragma omp for firstprivate(__pyx_v_i) lastprivate(__pyx_v_i)\n                    #endif /* _OPENMP */\n                    for (__pyx_t_2 = 0; __pyx_t_2 < __pyx_t_3; __pyx_t_2++){\n                        {\n                            __pyx_v_i = (int)(0 + 1 * __pyx_t_2);\n\n                            /* \"TTS/tts/utils/monotonic_align/core.pyx\":47\n *   cdef int i\n *   for i in prange(b, nogil=True):\n *     maximum_path_each(paths[i], values[i], t_xs[i], t_ys[i], max_neg_val)             # <<<<<<<<<<<<<<\n */\n                            __pyx_t_4.data = __pyx_v_paths.data;\n                            __pyx_t_4.memview = __pyx_v_paths.memview;\n                            __PYX_INC_MEMVIEW(&__pyx_t_4, 0);\n                            {\n    Py_ssize_t __pyx_tmp_idx = __pyx_v_i;\n    Py_ssize_t __pyx_tmp_stride = __pyx_v_paths.strides[0];\n        __pyx_t_4.data += __pyx_tmp_idx * __pyx_tmp_stride;\n}\n\n__pyx_t_4.shape[0] = __pyx_v_paths.shape[1];\n__pyx_t_4.strides[0] = __pyx_v_paths.strides[1];\n    __pyx_t_4.suboffsets[0] = -1;\n\n__pyx_t_4.shape[1] = __pyx_v_paths.shape[2];\n__pyx_t_4.strides[1] = __pyx_v_paths.strides[2];\n    __pyx_t_4.suboffsets[1] = -1;\n\n__pyx_t_5.data = __pyx_v_values.data;\n                            __pyx_t_5.memview = __pyx_v_values.memview;\n                            __PYX_INC_MEMVIEW(&__pyx_t_5, 0);\n                            {\n    Py_ssize_t __pyx_tmp_idx = __pyx_v_i;\n    Py_ssize_t __pyx_tmp_stride = __pyx_v_values.strides[0];\n        __pyx_t_5.data += __pyx_tmp_idx * __pyx_tmp_stride;\n}\n\n__pyx_t_5.shape[0] = __pyx_v_values.shape[1];\n__pyx_t_5.strides[0] = __pyx_v_values.strides[1];\n    __pyx_t_5.suboffsets[0] = 
-1;\n\n__pyx_t_5.shape[1] = __pyx_v_values.shape[2];\n__pyx_t_5.strides[1] = __pyx_v_values.strides[2];\n    __pyx_t_5.suboffsets[1] = -1;\n\n__pyx_t_6 = __pyx_v_i;\n                            __pyx_t_7 = __pyx_v_i;\n                            __pyx_f_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_each(__pyx_t_4, __pyx_t_5, (*((int *) ( /* dim=0 */ ((char *) (((int *) __pyx_v_t_xs.data) + __pyx_t_6)) ))), (*((int *) ( /* dim=0 */ ((char *) (((int *) __pyx_v_t_ys.data) + __pyx_t_7)) ))), __pyx_v_max_neg_val);\n                            __PYX_XDEC_MEMVIEW(&__pyx_t_4, 0);\n                            __pyx_t_4.memview = NULL;\n                            __pyx_t_4.data = NULL;\n                            __PYX_XDEC_MEMVIEW(&__pyx_t_5, 0);\n                            __pyx_t_5.memview = NULL;\n                            __pyx_t_5.data = NULL;\n                        }\n                    }\n                }\n            }\n        }\n        #if ((defined(__APPLE__) || defined(__OSX__)) && (defined(__GNUC__) && (__GNUC__ > 2 || (__GNUC__ == 2 && (__GNUC_MINOR__ > 95)))))\n            #undef likely\n            #undef unlikely\n            #define likely(x)   __builtin_expect(!!(x), 1)\n            #define unlikely(x) __builtin_expect(!!(x), 0)\n        #endif\n      }\n\n      /* \"TTS/tts/utils/monotonic_align/core.pyx\":46\n * \n *   cdef int i\n *   for i in prange(b, nogil=True):             # <<<<<<<<<<<<<<\n *     maximum_path_each(paths[i], values[i], t_xs[i], t_ys[i], max_neg_val)\n */\n      /*finally:*/ {\n        /*normal exit:*/{\n          #ifdef WITH_THREAD\n          __Pyx_FastGIL_Forget();\n          Py_BLOCK_THREADS\n          #endif\n          goto __pyx_L5;\n        }\n        __pyx_L5:;\n      }\n  }\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":42\n * @cython.boundscheck(False)\n * @cython.wraparound(False)\n * cpdef void maximum_path_c(int[:,:,::1] paths, float[:,:,::1] values, int[::1] t_xs, int[::1] t_ys, float 
max_neg_val=-1e9) nogil:             # <<<<<<<<<<<<<<\n *   cdef int b = values.shape[0]\n * \n */\n\n  /* function exit code */\n}\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_3TTS_3tts_5utils_15monotonic_align_4core_1maximum_path_c(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/\nstatic PyObject *__pyx_pw_3TTS_3tts_5utils_15monotonic_align_4core_1maximum_path_c(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {\n  __Pyx_memviewslice __pyx_v_paths = { 0, 0, { 0 }, { 0 }, { 0 } };\n  __Pyx_memviewslice __pyx_v_values = { 0, 0, { 0 }, { 0 }, { 0 } };\n  __Pyx_memviewslice __pyx_v_t_xs = { 0, 0, { 0 }, { 0 }, { 0 } };\n  __Pyx_memviewslice __pyx_v_t_ys = { 0, 0, { 0 }, { 0 }, { 0 } };\n  float __pyx_v_max_neg_val;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"maximum_path_c (wrapper)\", 0);\n  {\n    static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_paths,&__pyx_n_s_values,&__pyx_n_s_t_xs,&__pyx_n_s_t_ys,&__pyx_n_s_max_neg_val,0};\n    PyObject* values[5] = {0,0,0,0,0};\n    if (unlikely(__pyx_kwds)) {\n      Py_ssize_t kw_args;\n      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);\n      switch (pos_args) {\n        case  5: values[4] = PyTuple_GET_ITEM(__pyx_args, 4);\n        CYTHON_FALLTHROUGH;\n        case  4: values[3] = PyTuple_GET_ITEM(__pyx_args, 3);\n        CYTHON_FALLTHROUGH;\n        case  3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n        CYTHON_FALLTHROUGH;\n        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n        CYTHON_FALLTHROUGH;\n        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        CYTHON_FALLTHROUGH;\n        case  0: break;\n        default: goto __pyx_L5_argtuple_error;\n      }\n      kw_args = PyDict_Size(__pyx_kwds);\n      switch (pos_args) {\n        case  0:\n        if (likely((values[0] = 
__Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_paths)) != 0)) kw_args--;\n        else goto __pyx_L5_argtuple_error;\n        CYTHON_FALLTHROUGH;\n        case  1:\n        if (likely((values[1] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_values)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"maximum_path_c\", 0, 4, 5, 1); __PYX_ERR(0, 42, __pyx_L3_error)\n        }\n        CYTHON_FALLTHROUGH;\n        case  2:\n        if (likely((values[2] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_t_xs)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"maximum_path_c\", 0, 4, 5, 2); __PYX_ERR(0, 42, __pyx_L3_error)\n        }\n        CYTHON_FALLTHROUGH;\n        case  3:\n        if (likely((values[3] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_t_ys)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"maximum_path_c\", 0, 4, 5, 3); __PYX_ERR(0, 42, __pyx_L3_error)\n        }\n        CYTHON_FALLTHROUGH;\n        case  4:\n        if (kw_args > 0) {\n          PyObject* value = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_max_neg_val);\n          if (value) { values[4] = value; kw_args--; }\n        }\n      }\n      if (unlikely(kw_args > 0)) {\n        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, \"maximum_path_c\") < 0)) __PYX_ERR(0, 42, __pyx_L3_error)\n      }\n    } else {\n      switch (PyTuple_GET_SIZE(__pyx_args)) {\n        case  5: values[4] = PyTuple_GET_ITEM(__pyx_args, 4);\n        CYTHON_FALLTHROUGH;\n        case  4: values[3] = PyTuple_GET_ITEM(__pyx_args, 3);\n        values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n        values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n        values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        break;\n        default: goto __pyx_L5_argtuple_error;\n      }\n    }\n    __pyx_v_paths = __Pyx_PyObject_to_MemoryviewSlice_d_d_dc_int(values[0], PyBUF_WRITABLE); if (unlikely(!__pyx_v_paths.memview)) 
__PYX_ERR(0, 42, __pyx_L3_error)\n    __pyx_v_values = __Pyx_PyObject_to_MemoryviewSlice_d_d_dc_float(values[1], PyBUF_WRITABLE); if (unlikely(!__pyx_v_values.memview)) __PYX_ERR(0, 42, __pyx_L3_error)\n    __pyx_v_t_xs = __Pyx_PyObject_to_MemoryviewSlice_dc_int(values[2], PyBUF_WRITABLE); if (unlikely(!__pyx_v_t_xs.memview)) __PYX_ERR(0, 42, __pyx_L3_error)\n    __pyx_v_t_ys = __Pyx_PyObject_to_MemoryviewSlice_dc_int(values[3], PyBUF_WRITABLE); if (unlikely(!__pyx_v_t_ys.memview)) __PYX_ERR(0, 42, __pyx_L3_error)\n    if (values[4]) {\n      __pyx_v_max_neg_val = __pyx_PyFloat_AsFloat(values[4]); if (unlikely((__pyx_v_max_neg_val == (float)-1) && PyErr_Occurred())) __PYX_ERR(0, 42, __pyx_L3_error)\n    } else {\n      __pyx_v_max_neg_val = __pyx_k_;\n    }\n  }\n  goto __pyx_L4_argument_unpacking_done;\n  __pyx_L5_argtuple_error:;\n  __Pyx_RaiseArgtupleInvalid(\"maximum_path_c\", 0, 4, 5, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(0, 42, __pyx_L3_error)\n  __pyx_L3_error:;\n  __Pyx_AddTraceback(\"TTS.tts.utils.monotonic_align.core.maximum_path_c\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __Pyx_RefNannyFinishContext();\n  return NULL;\n  __pyx_L4_argument_unpacking_done:;\n  __pyx_r = __pyx_pf_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c(__pyx_self, __pyx_v_paths, __pyx_v_values, __pyx_v_t_xs, __pyx_v_t_ys, __pyx_v_max_neg_val);\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c(CYTHON_UNUSED PyObject *__pyx_self, __Pyx_memviewslice __pyx_v_paths, __Pyx_memviewslice __pyx_v_values, __Pyx_memviewslice __pyx_v_t_xs, __Pyx_memviewslice __pyx_v_t_ys, float __pyx_v_max_neg_val) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  struct __pyx_opt_args_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int 
__pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"maximum_path_c\", 0);\n  __Pyx_XDECREF(__pyx_r);\n  if (unlikely(!__pyx_v_paths.memview)) { __Pyx_RaiseUnboundLocalError(\"paths\"); __PYX_ERR(0, 42, __pyx_L1_error) }\n  if (unlikely(!__pyx_v_values.memview)) { __Pyx_RaiseUnboundLocalError(\"values\"); __PYX_ERR(0, 42, __pyx_L1_error) }\n  if (unlikely(!__pyx_v_t_xs.memview)) { __Pyx_RaiseUnboundLocalError(\"t_xs\"); __PYX_ERR(0, 42, __pyx_L1_error) }\n  if (unlikely(!__pyx_v_t_ys.memview)) { __Pyx_RaiseUnboundLocalError(\"t_ys\"); __PYX_ERR(0, 42, __pyx_L1_error) }\n  __pyx_t_1.__pyx_n = 1;\n  __pyx_t_1.max_neg_val = __pyx_v_max_neg_val;\n  __pyx_f_3TTS_3tts_5utils_15monotonic_align_4core_maximum_path_c(__pyx_v_paths, __pyx_v_values, __pyx_v_t_xs, __pyx_v_t_ys, 0, &__pyx_t_1); \n  __pyx_t_2 = __Pyx_void_to_None(NULL); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 42, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"TTS.tts.utils.monotonic_align.core.maximum_path_c\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __PYX_XDEC_MEMVIEW(&__pyx_v_paths, 1);\n  __PYX_XDEC_MEMVIEW(&__pyx_v_values, 1);\n  __PYX_XDEC_MEMVIEW(&__pyx_v_t_xs, 1);\n  __PYX_XDEC_MEMVIEW(&__pyx_v_t_ys, 1);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":735\n * ctypedef npy_cdouble     complex_t\n * \n * cdef inline object PyArray_MultiIterNew1(a):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(1, <void*>a)\n * \n */\n\nstatic CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew1(PyObject *__pyx_v_a) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char 
*__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"PyArray_MultiIterNew1\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":736\n * \n * cdef inline object PyArray_MultiIterNew1(a):\n *     return PyArray_MultiIterNew(1, <void*>a)             # <<<<<<<<<<<<<<\n * \n * cdef inline object PyArray_MultiIterNew2(a, b):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = PyArray_MultiIterNew(1, ((void *)__pyx_v_a)); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 736, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":735\n * ctypedef npy_cdouble     complex_t\n * \n * cdef inline object PyArray_MultiIterNew1(a):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(1, <void*>a)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"numpy.PyArray_MultiIterNew1\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":738\n *     return PyArray_MultiIterNew(1, <void*>a)\n * \n * cdef inline object PyArray_MultiIterNew2(a, b):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(2, <void*>a, <void*>b)\n * \n */\n\nstatic CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew2(PyObject *__pyx_v_a, PyObject *__pyx_v_b) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"PyArray_MultiIterNew2\", 0);\n\n  /* 
\"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":739\n * \n * cdef inline object PyArray_MultiIterNew2(a, b):\n *     return PyArray_MultiIterNew(2, <void*>a, <void*>b)             # <<<<<<<<<<<<<<\n * \n * cdef inline object PyArray_MultiIterNew3(a, b, c):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = PyArray_MultiIterNew(2, ((void *)__pyx_v_a), ((void *)__pyx_v_b)); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 739, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":738\n *     return PyArray_MultiIterNew(1, <void*>a)\n * \n * cdef inline object PyArray_MultiIterNew2(a, b):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(2, <void*>a, <void*>b)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"numpy.PyArray_MultiIterNew2\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":741\n *     return PyArray_MultiIterNew(2, <void*>a, <void*>b)\n * \n * cdef inline object PyArray_MultiIterNew3(a, b, c):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(3, <void*>a, <void*>b, <void*> c)\n * \n */\n\nstatic CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew3(PyObject *__pyx_v_a, PyObject *__pyx_v_b, PyObject *__pyx_v_c) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"PyArray_MultiIterNew3\", 0);\n\n  /* 
\"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":742\n * \n * cdef inline object PyArray_MultiIterNew3(a, b, c):\n *     return PyArray_MultiIterNew(3, <void*>a, <void*>b, <void*> c)             # <<<<<<<<<<<<<<\n * \n * cdef inline object PyArray_MultiIterNew4(a, b, c, d):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = PyArray_MultiIterNew(3, ((void *)__pyx_v_a), ((void *)__pyx_v_b), ((void *)__pyx_v_c)); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 742, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":741\n *     return PyArray_MultiIterNew(2, <void*>a, <void*>b)\n * \n * cdef inline object PyArray_MultiIterNew3(a, b, c):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(3, <void*>a, <void*>b, <void*> c)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"numpy.PyArray_MultiIterNew3\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":744\n *     return PyArray_MultiIterNew(3, <void*>a, <void*>b, <void*> c)\n * \n * cdef inline object PyArray_MultiIterNew4(a, b, c, d):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(4, <void*>a, <void*>b, <void*>c, <void*> d)\n * \n */\n\nstatic CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew4(PyObject *__pyx_v_a, PyObject *__pyx_v_b, PyObject *__pyx_v_c, PyObject *__pyx_v_d) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  
__Pyx_RefNannySetupContext(\"PyArray_MultiIterNew4\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":745\n * \n * cdef inline object PyArray_MultiIterNew4(a, b, c, d):\n *     return PyArray_MultiIterNew(4, <void*>a, <void*>b, <void*>c, <void*> d)             # <<<<<<<<<<<<<<\n * \n * cdef inline object PyArray_MultiIterNew5(a, b, c, d, e):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = PyArray_MultiIterNew(4, ((void *)__pyx_v_a), ((void *)__pyx_v_b), ((void *)__pyx_v_c), ((void *)__pyx_v_d)); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 745, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":744\n *     return PyArray_MultiIterNew(3, <void*>a, <void*>b, <void*> c)\n * \n * cdef inline object PyArray_MultiIterNew4(a, b, c, d):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(4, <void*>a, <void*>b, <void*>c, <void*> d)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"numpy.PyArray_MultiIterNew4\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":747\n *     return PyArray_MultiIterNew(4, <void*>a, <void*>b, <void*>c, <void*> d)\n * \n * cdef inline object PyArray_MultiIterNew5(a, b, c, d, e):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(5, <void*>a, <void*>b, <void*>c, <void*> d, <void*> e)\n * \n */\n\nstatic CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew5(PyObject *__pyx_v_a, PyObject *__pyx_v_b, PyObject *__pyx_v_c, PyObject *__pyx_v_d, PyObject *__pyx_v_e) {\n  PyObject *__pyx_r = NULL;\n  
__Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"PyArray_MultiIterNew5\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":748\n * \n * cdef inline object PyArray_MultiIterNew5(a, b, c, d, e):\n *     return PyArray_MultiIterNew(5, <void*>a, <void*>b, <void*>c, <void*> d, <void*> e)             # <<<<<<<<<<<<<<\n * \n * cdef inline tuple PyDataType_SHAPE(dtype d):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = PyArray_MultiIterNew(5, ((void *)__pyx_v_a), ((void *)__pyx_v_b), ((void *)__pyx_v_c), ((void *)__pyx_v_d), ((void *)__pyx_v_e)); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 748, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":747\n *     return PyArray_MultiIterNew(4, <void*>a, <void*>b, <void*>c, <void*> d)\n * \n * cdef inline object PyArray_MultiIterNew5(a, b, c, d, e):             # <<<<<<<<<<<<<<\n *     return PyArray_MultiIterNew(5, <void*>a, <void*>b, <void*>c, <void*> d, <void*> e)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"numpy.PyArray_MultiIterNew5\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":750\n *     return PyArray_MultiIterNew(5, <void*>a, <void*>b, <void*>c, <void*> d, <void*> e)\n * \n * cdef inline tuple PyDataType_SHAPE(dtype d):             # <<<<<<<<<<<<<<\n *     if PyDataType_HASSUBARRAY(d):\n *         return <tuple>d.subarray.shape\n */\n\nstatic CYTHON_INLINE PyObject 
*__pyx_f_5numpy_PyDataType_SHAPE(PyArray_Descr *__pyx_v_d) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  __Pyx_RefNannySetupContext(\"PyDataType_SHAPE\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":751\n * \n * cdef inline tuple PyDataType_SHAPE(dtype d):\n *     if PyDataType_HASSUBARRAY(d):             # <<<<<<<<<<<<<<\n *         return <tuple>d.subarray.shape\n *     else:\n */\n  __pyx_t_1 = (PyDataType_HASSUBARRAY(__pyx_v_d) != 0);\n  if (__pyx_t_1) {\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":752\n * cdef inline tuple PyDataType_SHAPE(dtype d):\n *     if PyDataType_HASSUBARRAY(d):\n *         return <tuple>d.subarray.shape             # <<<<<<<<<<<<<<\n *     else:\n *         return ()\n */\n    __Pyx_XDECREF(__pyx_r);\n    __Pyx_INCREF(((PyObject*)__pyx_v_d->subarray->shape));\n    __pyx_r = ((PyObject*)__pyx_v_d->subarray->shape);\n    goto __pyx_L0;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":751\n * \n * cdef inline tuple PyDataType_SHAPE(dtype d):\n *     if PyDataType_HASSUBARRAY(d):             # <<<<<<<<<<<<<<\n *         return <tuple>d.subarray.shape\n *     else:\n */\n  }\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":754\n *         return <tuple>d.subarray.shape\n *     else:\n *         return ()             # <<<<<<<<<<<<<<\n * \n * \n */\n  /*else*/ {\n    __Pyx_XDECREF(__pyx_r);\n    __Pyx_INCREF(__pyx_empty_tuple);\n    __pyx_r = __pyx_empty_tuple;\n    goto __pyx_L0;\n  }\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":750\n *     return PyArray_MultiIterNew(5, <void*>a, <void*>b, <void*>c, <void*> d, <void*> e)\n * \n * cdef inline tuple 
PyDataType_SHAPE(dtype d):             # <<<<<<<<<<<<<<\n *     if PyDataType_HASSUBARRAY(d):\n *         return <tuple>d.subarray.shape\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":929\n *     int _import_umath() except -1\n * \n * cdef inline void set_array_base(ndarray arr, object base):             # <<<<<<<<<<<<<<\n *     Py_INCREF(base) # important to do this before stealing the reference below!\n *     PyArray_SetBaseObject(arr, base)\n */\n\nstatic CYTHON_INLINE void __pyx_f_5numpy_set_array_base(PyArrayObject *__pyx_v_arr, PyObject *__pyx_v_base) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"set_array_base\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":930\n * \n * cdef inline void set_array_base(ndarray arr, object base):\n *     Py_INCREF(base) # important to do this before stealing the reference below!             
# <<<<<<<<<<<<<<\n *     PyArray_SetBaseObject(arr, base)\n * \n */\n  Py_INCREF(__pyx_v_base);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":931\n * cdef inline void set_array_base(ndarray arr, object base):\n *     Py_INCREF(base) # important to do this before stealing the reference below!\n *     PyArray_SetBaseObject(arr, base)             # <<<<<<<<<<<<<<\n * \n * cdef inline object get_array_base(ndarray arr):\n */\n  (void)(PyArray_SetBaseObject(__pyx_v_arr, __pyx_v_base));\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":929\n *     int _import_umath() except -1\n * \n * cdef inline void set_array_base(ndarray arr, object base):             # <<<<<<<<<<<<<<\n *     Py_INCREF(base) # important to do this before stealing the reference below!\n *     PyArray_SetBaseObject(arr, base)\n */\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":933\n *     PyArray_SetBaseObject(arr, base)\n * \n * cdef inline object get_array_base(ndarray arr):             # <<<<<<<<<<<<<<\n *     base = PyArray_BASE(arr)\n *     if base is NULL:\n */\n\nstatic CYTHON_INLINE PyObject *__pyx_f_5numpy_get_array_base(PyArrayObject *__pyx_v_arr) {\n  PyObject *__pyx_v_base;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  __Pyx_RefNannySetupContext(\"get_array_base\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":934\n * \n * cdef inline object get_array_base(ndarray arr):\n *     base = PyArray_BASE(arr)             # <<<<<<<<<<<<<<\n *     if base is NULL:\n *         return None\n */\n  __pyx_v_base = PyArray_BASE(__pyx_v_arr);\n\n  /* 
\"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":935\n * cdef inline object get_array_base(ndarray arr):\n *     base = PyArray_BASE(arr)\n *     if base is NULL:             # <<<<<<<<<<<<<<\n *         return None\n *     return <object>base\n */\n  __pyx_t_1 = ((__pyx_v_base == NULL) != 0);\n  if (__pyx_t_1) {\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":936\n *     base = PyArray_BASE(arr)\n *     if base is NULL:\n *         return None             # <<<<<<<<<<<<<<\n *     return <object>base\n * \n */\n    __Pyx_XDECREF(__pyx_r);\n    __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n    goto __pyx_L0;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":935\n * cdef inline object get_array_base(ndarray arr):\n *     base = PyArray_BASE(arr)\n *     if base is NULL:             # <<<<<<<<<<<<<<\n *         return None\n *     return <object>base\n */\n  }\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":937\n *     if base is NULL:\n *         return None\n *     return <object>base             # <<<<<<<<<<<<<<\n * \n * # Versions of the import_* functions which are more suitable for\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(((PyObject *)__pyx_v_base));\n  __pyx_r = ((PyObject *)__pyx_v_base);\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":933\n *     PyArray_SetBaseObject(arr, base)\n * \n * cdef inline object get_array_base(ndarray arr):             # <<<<<<<<<<<<<<\n *     base = PyArray_BASE(arr)\n *     if base is NULL:\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* 
\"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":941\n * # Versions of the import_* functions which are more suitable for\n * # Cython code.\n * cdef inline int import_array() except -1:             # <<<<<<<<<<<<<<\n *     try:\n *         __pyx_import_array()\n */\n\nstatic CYTHON_INLINE int __pyx_f_5numpy_import_array(void) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_t_4;\n  PyObject *__pyx_t_5 = NULL;\n  PyObject *__pyx_t_6 = NULL;\n  PyObject *__pyx_t_7 = NULL;\n  PyObject *__pyx_t_8 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"import_array\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":942\n * # Cython code.\n * cdef inline int import_array() except -1:\n *     try:             # <<<<<<<<<<<<<<\n *         __pyx_import_array()\n *     except Exception:\n */\n  {\n    __Pyx_PyThreadState_declare\n    __Pyx_PyThreadState_assign\n    __Pyx_ExceptionSave(&__pyx_t_1, &__pyx_t_2, &__pyx_t_3);\n    __Pyx_XGOTREF(__pyx_t_1);\n    __Pyx_XGOTREF(__pyx_t_2);\n    __Pyx_XGOTREF(__pyx_t_3);\n    /*try:*/ {\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":943\n * cdef inline int import_array() except -1:\n *     try:\n *         __pyx_import_array()             # <<<<<<<<<<<<<<\n *     except Exception:\n *         raise ImportError(\"numpy.core.multiarray failed to import\")\n */\n      __pyx_t_4 = _import_array(); if (unlikely(__pyx_t_4 == ((int)-1))) __PYX_ERR(1, 943, __pyx_L3_error)\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":942\n * # Cython code.\n * cdef inline int import_array() except -1:\n *  
   try:             # <<<<<<<<<<<<<<\n *         __pyx_import_array()\n *     except Exception:\n */\n    }\n    __Pyx_XDECREF(__pyx_t_1); __pyx_t_1 = 0;\n    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __Pyx_XDECREF(__pyx_t_3); __pyx_t_3 = 0;\n    goto __pyx_L8_try_end;\n    __pyx_L3_error:;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":944\n *     try:\n *         __pyx_import_array()\n *     except Exception:             # <<<<<<<<<<<<<<\n *         raise ImportError(\"numpy.core.multiarray failed to import\")\n * \n */\n    __pyx_t_4 = __Pyx_PyErr_ExceptionMatches(((PyObject *)(&((PyTypeObject*)PyExc_Exception)[0])));\n    if (__pyx_t_4) {\n      __Pyx_AddTraceback(\"numpy.import_array\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n      if (__Pyx_GetException(&__pyx_t_5, &__pyx_t_6, &__pyx_t_7) < 0) __PYX_ERR(1, 944, __pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_5);\n      __Pyx_GOTREF(__pyx_t_6);\n      __Pyx_GOTREF(__pyx_t_7);\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":945\n *         __pyx_import_array()\n *     except Exception:\n *         raise ImportError(\"numpy.core.multiarray failed to import\")             # <<<<<<<<<<<<<<\n * \n * cdef inline int import_umath() except -1:\n */\n      __pyx_t_8 = __Pyx_PyObject_Call(__pyx_builtin_ImportError, __pyx_tuple__2, NULL); if (unlikely(!__pyx_t_8)) __PYX_ERR(1, 945, __pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_8);\n      __Pyx_Raise(__pyx_t_8, 0, 0, 0);\n      __Pyx_DECREF(__pyx_t_8); __pyx_t_8 = 0;\n      __PYX_ERR(1, 945, __pyx_L5_except_error)\n    }\n    goto __pyx_L5_except_error;\n    __pyx_L5_except_error:;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":942\n * # Cython code.\n * cdef inline int import_array() except -1:\n *     try:             # <<<<<<<<<<<<<<\n * 
        __pyx_import_array()\n *     except Exception:\n */\n    __Pyx_XGIVEREF(__pyx_t_1);\n    __Pyx_XGIVEREF(__pyx_t_2);\n    __Pyx_XGIVEREF(__pyx_t_3);\n    __Pyx_ExceptionReset(__pyx_t_1, __pyx_t_2, __pyx_t_3);\n    goto __pyx_L1_error;\n    __pyx_L8_try_end:;\n  }\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":941\n * # Versions of the import_* functions which are more suitable for\n * # Cython code.\n * cdef inline int import_array() except -1:             # <<<<<<<<<<<<<<\n *     try:\n *         __pyx_import_array()\n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_7);\n  __Pyx_XDECREF(__pyx_t_8);\n  __Pyx_AddTraceback(\"numpy.import_array\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":947\n *         raise ImportError(\"numpy.core.multiarray failed to import\")\n * \n * cdef inline int import_umath() except -1:             # <<<<<<<<<<<<<<\n *     try:\n *         _import_umath()\n */\n\nstatic CYTHON_INLINE int __pyx_f_5numpy_import_umath(void) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_t_4;\n  PyObject *__pyx_t_5 = NULL;\n  PyObject *__pyx_t_6 = NULL;\n  PyObject *__pyx_t_7 = NULL;\n  PyObject *__pyx_t_8 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"import_umath\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":948\n * \n * cdef inline int import_umath() except -1:\n *     try:             # 
<<<<<<<<<<<<<<\n *         _import_umath()\n *     except Exception:\n */\n  {\n    __Pyx_PyThreadState_declare\n    __Pyx_PyThreadState_assign\n    __Pyx_ExceptionSave(&__pyx_t_1, &__pyx_t_2, &__pyx_t_3);\n    __Pyx_XGOTREF(__pyx_t_1);\n    __Pyx_XGOTREF(__pyx_t_2);\n    __Pyx_XGOTREF(__pyx_t_3);\n    /*try:*/ {\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":949\n * cdef inline int import_umath() except -1:\n *     try:\n *         _import_umath()             # <<<<<<<<<<<<<<\n *     except Exception:\n *         raise ImportError(\"numpy.core.umath failed to import\")\n */\n      __pyx_t_4 = _import_umath(); if (unlikely(__pyx_t_4 == ((int)-1))) __PYX_ERR(1, 949, __pyx_L3_error)\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":948\n * \n * cdef inline int import_umath() except -1:\n *     try:             # <<<<<<<<<<<<<<\n *         _import_umath()\n *     except Exception:\n */\n    }\n    __Pyx_XDECREF(__pyx_t_1); __pyx_t_1 = 0;\n    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __Pyx_XDECREF(__pyx_t_3); __pyx_t_3 = 0;\n    goto __pyx_L8_try_end;\n    __pyx_L3_error:;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":950\n *     try:\n *         _import_umath()\n *     except Exception:             # <<<<<<<<<<<<<<\n *         raise ImportError(\"numpy.core.umath failed to import\")\n * \n */\n    __pyx_t_4 = __Pyx_PyErr_ExceptionMatches(((PyObject *)(&((PyTypeObject*)PyExc_Exception)[0])));\n    if (__pyx_t_4) {\n      __Pyx_AddTraceback(\"numpy.import_umath\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n      if (__Pyx_GetException(&__pyx_t_5, &__pyx_t_6, &__pyx_t_7) < 0) __PYX_ERR(1, 950, __pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_5);\n      __Pyx_GOTREF(__pyx_t_6);\n      __Pyx_GOTREF(__pyx_t_7);\n\n      /* 
\"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":951\n *         _import_umath()\n *     except Exception:\n *         raise ImportError(\"numpy.core.umath failed to import\")             # <<<<<<<<<<<<<<\n * \n * cdef inline int import_ufunc() except -1:\n */\n      __pyx_t_8 = __Pyx_PyObject_Call(__pyx_builtin_ImportError, __pyx_tuple__3, NULL); if (unlikely(!__pyx_t_8)) __PYX_ERR(1, 951, __pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_8);\n      __Pyx_Raise(__pyx_t_8, 0, 0, 0);\n      __Pyx_DECREF(__pyx_t_8); __pyx_t_8 = 0;\n      __PYX_ERR(1, 951, __pyx_L5_except_error)\n    }\n    goto __pyx_L5_except_error;\n    __pyx_L5_except_error:;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":948\n * \n * cdef inline int import_umath() except -1:\n *     try:             # <<<<<<<<<<<<<<\n *         _import_umath()\n *     except Exception:\n */\n    __Pyx_XGIVEREF(__pyx_t_1);\n    __Pyx_XGIVEREF(__pyx_t_2);\n    __Pyx_XGIVEREF(__pyx_t_3);\n    __Pyx_ExceptionReset(__pyx_t_1, __pyx_t_2, __pyx_t_3);\n    goto __pyx_L1_error;\n    __pyx_L8_try_end:;\n  }\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":947\n *         raise ImportError(\"numpy.core.multiarray failed to import\")\n * \n * cdef inline int import_umath() except -1:             # <<<<<<<<<<<<<<\n *     try:\n *         _import_umath()\n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_7);\n  __Pyx_XDECREF(__pyx_t_8);\n  __Pyx_AddTraceback(\"numpy.import_umath\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* 
\"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":953\n *         raise ImportError(\"numpy.core.umath failed to import\")\n * \n * cdef inline int import_ufunc() except -1:             # <<<<<<<<<<<<<<\n *     try:\n *         _import_umath()\n */\n\nstatic CYTHON_INLINE int __pyx_f_5numpy_import_ufunc(void) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_t_4;\n  PyObject *__pyx_t_5 = NULL;\n  PyObject *__pyx_t_6 = NULL;\n  PyObject *__pyx_t_7 = NULL;\n  PyObject *__pyx_t_8 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"import_ufunc\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":954\n * \n * cdef inline int import_ufunc() except -1:\n *     try:             # <<<<<<<<<<<<<<\n *         _import_umath()\n *     except Exception:\n */\n  {\n    __Pyx_PyThreadState_declare\n    __Pyx_PyThreadState_assign\n    __Pyx_ExceptionSave(&__pyx_t_1, &__pyx_t_2, &__pyx_t_3);\n    __Pyx_XGOTREF(__pyx_t_1);\n    __Pyx_XGOTREF(__pyx_t_2);\n    __Pyx_XGOTREF(__pyx_t_3);\n    /*try:*/ {\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":955\n * cdef inline int import_ufunc() except -1:\n *     try:\n *         _import_umath()             # <<<<<<<<<<<<<<\n *     except Exception:\n *         raise ImportError(\"numpy.core.umath failed to import\")\n */\n      __pyx_t_4 = _import_umath(); if (unlikely(__pyx_t_4 == ((int)-1))) __PYX_ERR(1, 955, __pyx_L3_error)\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":954\n * \n * cdef inline int import_ufunc() except -1:\n *     try:             # <<<<<<<<<<<<<<\n *         
_import_umath()\n *     except Exception:\n */\n    }\n    __Pyx_XDECREF(__pyx_t_1); __pyx_t_1 = 0;\n    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __Pyx_XDECREF(__pyx_t_3); __pyx_t_3 = 0;\n    goto __pyx_L8_try_end;\n    __pyx_L3_error:;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":956\n *     try:\n *         _import_umath()\n *     except Exception:             # <<<<<<<<<<<<<<\n *         raise ImportError(\"numpy.core.umath failed to import\")\n * \n */\n    __pyx_t_4 = __Pyx_PyErr_ExceptionMatches(((PyObject *)(&((PyTypeObject*)PyExc_Exception)[0])));\n    if (__pyx_t_4) {\n      __Pyx_AddTraceback(\"numpy.import_ufunc\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n      if (__Pyx_GetException(&__pyx_t_5, &__pyx_t_6, &__pyx_t_7) < 0) __PYX_ERR(1, 956, __pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_5);\n      __Pyx_GOTREF(__pyx_t_6);\n      __Pyx_GOTREF(__pyx_t_7);\n\n      /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":957\n *         _import_umath()\n *     except Exception:\n *         raise ImportError(\"numpy.core.umath failed to import\")             # <<<<<<<<<<<<<<\n * \n * cdef extern from *:\n */\n      __pyx_t_8 = __Pyx_PyObject_Call(__pyx_builtin_ImportError, __pyx_tuple__3, NULL); if (unlikely(!__pyx_t_8)) __PYX_ERR(1, 957, __pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_8);\n      __Pyx_Raise(__pyx_t_8, 0, 0, 0);\n      __Pyx_DECREF(__pyx_t_8); __pyx_t_8 = 0;\n      __PYX_ERR(1, 957, __pyx_L5_except_error)\n    }\n    goto __pyx_L5_except_error;\n    __pyx_L5_except_error:;\n\n    /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":954\n * \n * cdef inline int import_ufunc() except -1:\n *     try:             # <<<<<<<<<<<<<<\n *         _import_umath()\n *     except Exception:\n */\n    __Pyx_XGIVEREF(__pyx_t_1);\n    
__Pyx_XGIVEREF(__pyx_t_2);\n    __Pyx_XGIVEREF(__pyx_t_3);\n    __Pyx_ExceptionReset(__pyx_t_1, __pyx_t_2, __pyx_t_3);\n    goto __pyx_L1_error;\n    __pyx_L8_try_end:;\n  }\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":953\n *         raise ImportError(\"numpy.core.umath failed to import\")\n * \n * cdef inline int import_ufunc() except -1:             # <<<<<<<<<<<<<<\n *     try:\n *         _import_umath()\n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_7);\n  __Pyx_XDECREF(__pyx_t_8);\n  __Pyx_AddTraceback(\"numpy.import_ufunc\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":967\n * \n * \n * cdef inline bint is_timedelta64_object(object obj):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Cython equivalent of `isinstance(obj, np.timedelta64)`\n */\n\nstatic CYTHON_INLINE int __pyx_f_5numpy_is_timedelta64_object(PyObject *__pyx_v_obj) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"is_timedelta64_object\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":979\n *     bool\n *     \"\"\"\n *     return PyObject_TypeCheck(obj, &PyTimedeltaArrType_Type)             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = PyObject_TypeCheck(__pyx_v_obj, (&PyTimedeltaArrType_Type));\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":967\n * \n * \n * cdef inline bint is_timedelta64_object(object obj):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Cython equivalent of `isinstance(obj, 
np.timedelta64)`\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":982\n * \n * \n * cdef inline bint is_datetime64_object(object obj):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Cython equivalent of `isinstance(obj, np.datetime64)`\n */\n\nstatic CYTHON_INLINE int __pyx_f_5numpy_is_datetime64_object(PyObject *__pyx_v_obj) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"is_datetime64_object\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":994\n *     bool\n *     \"\"\"\n *     return PyObject_TypeCheck(obj, &PyDatetimeArrType_Type)             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = PyObject_TypeCheck(__pyx_v_obj, (&PyDatetimeArrType_Type));\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":982\n * \n * \n * cdef inline bint is_datetime64_object(object obj):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Cython equivalent of `isinstance(obj, np.datetime64)`\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":997\n * \n * \n * cdef inline npy_datetime get_datetime64_value(object obj) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     returns the int64 value underlying scalar numpy datetime64 object\n */\n\nstatic CYTHON_INLINE npy_datetime __pyx_f_5numpy_get_datetime64_value(PyObject *__pyx_v_obj) {\n  npy_datetime __pyx_r;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":1004\n *     also needed.  
That can be found using `get_datetime64_unit`.\n *     \"\"\"\n *     return (<PyDatetimeScalarObject*>obj).obval             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = ((PyDatetimeScalarObject *)__pyx_v_obj)->obval;\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":997\n * \n * \n * cdef inline npy_datetime get_datetime64_value(object obj) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     returns the int64 value underlying scalar numpy datetime64 object\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":1007\n * \n * \n * cdef inline npy_timedelta get_timedelta64_value(object obj) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     returns the int64 value underlying scalar numpy timedelta64 object\n */\n\nstatic CYTHON_INLINE npy_timedelta __pyx_f_5numpy_get_timedelta64_value(PyObject *__pyx_v_obj) {\n  npy_timedelta __pyx_r;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":1011\n *     returns the int64 value underlying scalar numpy timedelta64 object\n *     \"\"\"\n *     return (<PyTimedeltaScalarObject*>obj).obval             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = ((PyTimedeltaScalarObject *)__pyx_v_obj)->obval;\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":1007\n * \n * \n * cdef inline npy_timedelta get_timedelta64_value(object obj) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     returns the int64 value underlying scalar numpy timedelta64 object\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":1014\n * \n * \n * cdef 
inline NPY_DATETIMEUNIT get_datetime64_unit(object obj) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     returns the unit part of the dtype for a numpy datetime64 object.\n */\n\nstatic CYTHON_INLINE NPY_DATETIMEUNIT __pyx_f_5numpy_get_datetime64_unit(PyObject *__pyx_v_obj) {\n  NPY_DATETIMEUNIT __pyx_r;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":1018\n *     returns the unit part of the dtype for a numpy datetime64 object.\n *     \"\"\"\n *     return <NPY_DATETIMEUNIT>(<PyDatetimeScalarObject*>obj).obmeta.base             # <<<<<<<<<<<<<<\n */\n  __pyx_r = ((NPY_DATETIMEUNIT)((PyDatetimeScalarObject *)__pyx_v_obj)->obmeta.base);\n  goto __pyx_L0;\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":1014\n * \n * \n * cdef inline NPY_DATETIMEUNIT get_datetime64_unit(object obj) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     returns the unit part of the dtype for a numpy datetime64 object.\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":122\n *         cdef bint dtype_is_object\n * \n *     def __cinit__(array self, tuple shape, Py_ssize_t itemsize, format not None,             # <<<<<<<<<<<<<<\n *                   mode=\"c\", bint allocate_buffer=True):\n * \n */\n\n/* Python wrapper */\nstatic int __pyx_array___cinit__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/\nstatic int __pyx_array___cinit__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {\n  PyObject *__pyx_v_shape = 0;\n  Py_ssize_t __pyx_v_itemsize;\n  PyObject *__pyx_v_format = 0;\n  PyObject *__pyx_v_mode = 0;\n  int __pyx_v_allocate_buffer;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__cinit__ (wrapper)\", 0);\n  {\n    
static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_shape,&__pyx_n_s_itemsize,&__pyx_n_s_format,&__pyx_n_s_mode,&__pyx_n_s_allocate_buffer,0};\n    PyObject* values[5] = {0,0,0,0,0};\n    values[3] = ((PyObject *)__pyx_n_s_c);\n    if (unlikely(__pyx_kwds)) {\n      Py_ssize_t kw_args;\n      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);\n      switch (pos_args) {\n        case  5: values[4] = PyTuple_GET_ITEM(__pyx_args, 4);\n        CYTHON_FALLTHROUGH;\n        case  4: values[3] = PyTuple_GET_ITEM(__pyx_args, 3);\n        CYTHON_FALLTHROUGH;\n        case  3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n        CYTHON_FALLTHROUGH;\n        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n        CYTHON_FALLTHROUGH;\n        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        CYTHON_FALLTHROUGH;\n        case  0: break;\n        default: goto __pyx_L5_argtuple_error;\n      }\n      kw_args = PyDict_Size(__pyx_kwds);\n      switch (pos_args) {\n        case  0:\n        if (likely((values[0] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_shape)) != 0)) kw_args--;\n        else goto __pyx_L5_argtuple_error;\n        CYTHON_FALLTHROUGH;\n        case  1:\n        if (likely((values[1] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_itemsize)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"__cinit__\", 0, 3, 5, 1); __PYX_ERR(2, 122, __pyx_L3_error)\n        }\n        CYTHON_FALLTHROUGH;\n        case  2:\n        if (likely((values[2] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_format)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"__cinit__\", 0, 3, 5, 2); __PYX_ERR(2, 122, __pyx_L3_error)\n        }\n        CYTHON_FALLTHROUGH;\n        case  3:\n        if (kw_args > 0) {\n          PyObject* value = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_mode);\n          if (value) { values[3] = value; kw_args--; }\n        }\n        CYTHON_FALLTHROUGH;\n        case  4:\n        
if (kw_args > 0) {\n          PyObject* value = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_allocate_buffer);\n          if (value) { values[4] = value; kw_args--; }\n        }\n      }\n      if (unlikely(kw_args > 0)) {\n        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, \"__cinit__\") < 0)) __PYX_ERR(2, 122, __pyx_L3_error)\n      }\n    } else {\n      switch (PyTuple_GET_SIZE(__pyx_args)) {\n        case  5: values[4] = PyTuple_GET_ITEM(__pyx_args, 4);\n        CYTHON_FALLTHROUGH;\n        case  4: values[3] = PyTuple_GET_ITEM(__pyx_args, 3);\n        CYTHON_FALLTHROUGH;\n        case  3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n        values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n        values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        break;\n        default: goto __pyx_L5_argtuple_error;\n      }\n    }\n    __pyx_v_shape = ((PyObject*)values[0]);\n    __pyx_v_itemsize = __Pyx_PyIndex_AsSsize_t(values[1]); if (unlikely((__pyx_v_itemsize == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 122, __pyx_L3_error)\n    __pyx_v_format = values[2];\n    __pyx_v_mode = values[3];\n    if (values[4]) {\n      __pyx_v_allocate_buffer = __Pyx_PyObject_IsTrue(values[4]); if (unlikely((__pyx_v_allocate_buffer == (int)-1) && PyErr_Occurred())) __PYX_ERR(2, 123, __pyx_L3_error)\n    } else {\n\n      /* \"View.MemoryView\":123\n * \n *     def __cinit__(array self, tuple shape, Py_ssize_t itemsize, format not None,\n *                   mode=\"c\", bint allocate_buffer=True):             # <<<<<<<<<<<<<<\n * \n *         cdef int idx\n */\n      __pyx_v_allocate_buffer = ((int)1);\n    }\n  }\n  goto __pyx_L4_argument_unpacking_done;\n  __pyx_L5_argtuple_error:;\n  __Pyx_RaiseArgtupleInvalid(\"__cinit__\", 0, 3, 5, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(2, 122, __pyx_L3_error)\n  __pyx_L3_error:;\n  __Pyx_AddTraceback(\"View.MemoryView.array.__cinit__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  
__Pyx_RefNannyFinishContext();\n  return -1;\n  __pyx_L4_argument_unpacking_done:;\n  if (unlikely(!__Pyx_ArgTypeTest(((PyObject *)__pyx_v_shape), (&PyTuple_Type), 1, \"shape\", 1))) __PYX_ERR(2, 122, __pyx_L1_error)\n  if (unlikely(((PyObject *)__pyx_v_format) == Py_None)) {\n    PyErr_Format(PyExc_TypeError, \"Argument '%.200s' must not be None\", \"format\"); __PYX_ERR(2, 122, __pyx_L1_error)\n  }\n  __pyx_r = __pyx_array___pyx_pf_15View_dot_MemoryView_5array___cinit__(((struct __pyx_array_obj *)__pyx_v_self), __pyx_v_shape, __pyx_v_itemsize, __pyx_v_format, __pyx_v_mode, __pyx_v_allocate_buffer);\n\n  /* \"View.MemoryView\":122\n *         cdef bint dtype_is_object\n * \n *     def __cinit__(array self, tuple shape, Py_ssize_t itemsize, format not None,             # <<<<<<<<<<<<<<\n *                   mode=\"c\", bint allocate_buffer=True):\n * \n */\n\n  /* function exit code */\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic int __pyx_array___pyx_pf_15View_dot_MemoryView_5array___cinit__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_shape, Py_ssize_t __pyx_v_itemsize, PyObject *__pyx_v_format, PyObject *__pyx_v_mode, int __pyx_v_allocate_buffer) {\n  int __pyx_v_idx;\n  Py_ssize_t __pyx_v_i;\n  Py_ssize_t __pyx_v_dim;\n  PyObject **__pyx_v_p;\n  char __pyx_v_order;\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  Py_ssize_t __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_t_4;\n  PyObject *__pyx_t_5 = NULL;\n  PyObject *__pyx_t_6 = NULL;\n  char *__pyx_t_7;\n  int __pyx_t_8;\n  Py_ssize_t __pyx_t_9;\n  PyObject *__pyx_t_10 = NULL;\n  Py_ssize_t __pyx_t_11;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__cinit__\", 0);\n  __Pyx_INCREF(__pyx_v_format);\n\n  /* \"View.MemoryView\":129\n *         cdef PyObject **p\n * \n *         self.ndim = <int> len(shape)          
   # <<<<<<<<<<<<<<\n *         self.itemsize = itemsize\n * \n */\n  if (unlikely(__pyx_v_shape == Py_None)) {\n    PyErr_SetString(PyExc_TypeError, \"object of type 'NoneType' has no len()\");\n    __PYX_ERR(2, 129, __pyx_L1_error)\n  }\n  __pyx_t_1 = PyTuple_GET_SIZE(__pyx_v_shape); if (unlikely(__pyx_t_1 == ((Py_ssize_t)-1))) __PYX_ERR(2, 129, __pyx_L1_error)\n  __pyx_v_self->ndim = ((int)__pyx_t_1);\n\n  /* \"View.MemoryView\":130\n * \n *         self.ndim = <int> len(shape)\n *         self.itemsize = itemsize             # <<<<<<<<<<<<<<\n * \n *         if not self.ndim:\n */\n  __pyx_v_self->itemsize = __pyx_v_itemsize;\n\n  /* \"View.MemoryView\":132\n *         self.itemsize = itemsize\n * \n *         if not self.ndim:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Empty shape tuple for cython.array\")\n * \n */\n  __pyx_t_2 = ((!(__pyx_v_self->ndim != 0)) != 0);\n  if (unlikely(__pyx_t_2)) {\n\n    /* \"View.MemoryView\":133\n * \n *         if not self.ndim:\n *             raise ValueError(\"Empty shape tuple for cython.array\")             # <<<<<<<<<<<<<<\n * \n *         if itemsize <= 0:\n */\n    __pyx_t_3 = __Pyx_PyObject_Call(__pyx_builtin_ValueError, __pyx_tuple__4, NULL); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 133, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __PYX_ERR(2, 133, __pyx_L1_error)\n\n    /* \"View.MemoryView\":132\n *         self.itemsize = itemsize\n * \n *         if not self.ndim:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Empty shape tuple for cython.array\")\n * \n */\n  }\n\n  /* \"View.MemoryView\":135\n *             raise ValueError(\"Empty shape tuple for cython.array\")\n * \n *         if itemsize <= 0:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"itemsize <= 0 for cython.array\")\n * \n */\n  __pyx_t_2 = ((__pyx_v_itemsize <= 0) != 0);\n  if (unlikely(__pyx_t_2)) {\n\n    
/* \"View.MemoryView\":136\n * \n *         if itemsize <= 0:\n *             raise ValueError(\"itemsize <= 0 for cython.array\")             # <<<<<<<<<<<<<<\n * \n *         if not isinstance(format, bytes):\n */\n    __pyx_t_3 = __Pyx_PyObject_Call(__pyx_builtin_ValueError, __pyx_tuple__5, NULL); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 136, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __PYX_ERR(2, 136, __pyx_L1_error)\n\n    /* \"View.MemoryView\":135\n *             raise ValueError(\"Empty shape tuple for cython.array\")\n * \n *         if itemsize <= 0:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"itemsize <= 0 for cython.array\")\n * \n */\n  }\n\n  /* \"View.MemoryView\":138\n *             raise ValueError(\"itemsize <= 0 for cython.array\")\n * \n *         if not isinstance(format, bytes):             # <<<<<<<<<<<<<<\n *             format = format.encode('ASCII')\n *         self._format = format  # keep a reference to the byte string\n */\n  __pyx_t_2 = PyBytes_Check(__pyx_v_format); \n  __pyx_t_4 = ((!(__pyx_t_2 != 0)) != 0);\n  if (__pyx_t_4) {\n\n    /* \"View.MemoryView\":139\n * \n *         if not isinstance(format, bytes):\n *             format = format.encode('ASCII')             # <<<<<<<<<<<<<<\n *         self._format = format  # keep a reference to the byte string\n *         self.format = self._format\n */\n    __pyx_t_5 = __Pyx_PyObject_GetAttrStr(__pyx_v_format, __pyx_n_s_encode); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 139, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    __pyx_t_6 = NULL;\n    if (CYTHON_UNPACK_METHODS && likely(PyMethod_Check(__pyx_t_5))) {\n      __pyx_t_6 = PyMethod_GET_SELF(__pyx_t_5);\n      if (likely(__pyx_t_6)) {\n        PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_5);\n        __Pyx_INCREF(__pyx_t_6);\n        __Pyx_INCREF(function);\n        __Pyx_DECREF_SET(__pyx_t_5, function);\n      }\n   
 }\n    __pyx_t_3 = (__pyx_t_6) ? __Pyx_PyObject_Call2Args(__pyx_t_5, __pyx_t_6, __pyx_n_s_ASCII) : __Pyx_PyObject_CallOneArg(__pyx_t_5, __pyx_n_s_ASCII);\n    __Pyx_XDECREF(__pyx_t_6); __pyx_t_6 = 0;\n    if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 139, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;\n    __Pyx_DECREF_SET(__pyx_v_format, __pyx_t_3);\n    __pyx_t_3 = 0;\n\n    /* \"View.MemoryView\":138\n *             raise ValueError(\"itemsize <= 0 for cython.array\")\n * \n *         if not isinstance(format, bytes):             # <<<<<<<<<<<<<<\n *             format = format.encode('ASCII')\n *         self._format = format  # keep a reference to the byte string\n */\n  }\n\n  /* \"View.MemoryView\":140\n *         if not isinstance(format, bytes):\n *             format = format.encode('ASCII')\n *         self._format = format  # keep a reference to the byte string             # <<<<<<<<<<<<<<\n *         self.format = self._format\n * \n */\n  if (!(likely(PyBytes_CheckExact(__pyx_v_format))||((__pyx_v_format) == Py_None)||(PyErr_Format(PyExc_TypeError, \"Expected %.16s, got %.200s\", \"bytes\", Py_TYPE(__pyx_v_format)->tp_name), 0))) __PYX_ERR(2, 140, __pyx_L1_error)\n  __pyx_t_3 = __pyx_v_format;\n  __Pyx_INCREF(__pyx_t_3);\n  __Pyx_GIVEREF(__pyx_t_3);\n  __Pyx_GOTREF(__pyx_v_self->_format);\n  __Pyx_DECREF(__pyx_v_self->_format);\n  __pyx_v_self->_format = ((PyObject*)__pyx_t_3);\n  __pyx_t_3 = 0;\n\n  /* \"View.MemoryView\":141\n *             format = format.encode('ASCII')\n *         self._format = format  # keep a reference to the byte string\n *         self.format = self._format             # <<<<<<<<<<<<<<\n * \n * \n */\n  if (unlikely(__pyx_v_self->_format == Py_None)) {\n    PyErr_SetString(PyExc_TypeError, \"expected bytes, NoneType found\");\n    __PYX_ERR(2, 141, __pyx_L1_error)\n  }\n  __pyx_t_7 = __Pyx_PyBytes_AsWritableString(__pyx_v_self->_format); if (unlikely((!__pyx_t_7) && PyErr_Occurred())) 
__PYX_ERR(2, 141, __pyx_L1_error)\n  __pyx_v_self->format = __pyx_t_7;\n\n  /* \"View.MemoryView\":144\n * \n * \n *         self._shape = <Py_ssize_t *> PyObject_Malloc(sizeof(Py_ssize_t)*self.ndim*2)             # <<<<<<<<<<<<<<\n *         self._strides = self._shape + self.ndim\n * \n */\n  __pyx_v_self->_shape = ((Py_ssize_t *)PyObject_Malloc((((sizeof(Py_ssize_t)) * __pyx_v_self->ndim) * 2)));\n\n  /* \"View.MemoryView\":145\n * \n *         self._shape = <Py_ssize_t *> PyObject_Malloc(sizeof(Py_ssize_t)*self.ndim*2)\n *         self._strides = self._shape + self.ndim             # <<<<<<<<<<<<<<\n * \n *         if not self._shape:\n */\n  __pyx_v_self->_strides = (__pyx_v_self->_shape + __pyx_v_self->ndim);\n\n  /* \"View.MemoryView\":147\n *         self._strides = self._shape + self.ndim\n * \n *         if not self._shape:             # <<<<<<<<<<<<<<\n *             raise MemoryError(\"unable to allocate shape and strides.\")\n * \n */\n  __pyx_t_4 = ((!(__pyx_v_self->_shape != 0)) != 0);\n  if (unlikely(__pyx_t_4)) {\n\n    /* \"View.MemoryView\":148\n * \n *         if not self._shape:\n *             raise MemoryError(\"unable to allocate shape and strides.\")             # <<<<<<<<<<<<<<\n * \n * \n */\n    __pyx_t_3 = __Pyx_PyObject_Call(__pyx_builtin_MemoryError, __pyx_tuple__6, NULL); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 148, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __PYX_ERR(2, 148, __pyx_L1_error)\n\n    /* \"View.MemoryView\":147\n *         self._strides = self._shape + self.ndim\n * \n *         if not self._shape:             # <<<<<<<<<<<<<<\n *             raise MemoryError(\"unable to allocate shape and strides.\")\n * \n */\n  }\n\n  /* \"View.MemoryView\":151\n * \n * \n *         for idx, dim in enumerate(shape):             # <<<<<<<<<<<<<<\n *             if dim <= 0:\n *                 raise ValueError(\"Invalid shape in axis %d: %d.\" % 
(idx, dim))\n */\n  __pyx_t_8 = 0;\n  __pyx_t_3 = __pyx_v_shape; __Pyx_INCREF(__pyx_t_3); __pyx_t_1 = 0;\n  for (;;) {\n    if (__pyx_t_1 >= PyTuple_GET_SIZE(__pyx_t_3)) break;\n    #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n    __pyx_t_5 = PyTuple_GET_ITEM(__pyx_t_3, __pyx_t_1); __Pyx_INCREF(__pyx_t_5); __pyx_t_1++; if (unlikely(0 < 0)) __PYX_ERR(2, 151, __pyx_L1_error)\n    #else\n    __pyx_t_5 = PySequence_ITEM(__pyx_t_3, __pyx_t_1); __pyx_t_1++; if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 151, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    #endif\n    __pyx_t_9 = __Pyx_PyIndex_AsSsize_t(__pyx_t_5); if (unlikely((__pyx_t_9 == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 151, __pyx_L1_error)\n    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;\n    __pyx_v_dim = __pyx_t_9;\n    __pyx_v_idx = __pyx_t_8;\n    __pyx_t_8 = (__pyx_t_8 + 1);\n\n    /* \"View.MemoryView\":152\n * \n *         for idx, dim in enumerate(shape):\n *             if dim <= 0:             # <<<<<<<<<<<<<<\n *                 raise ValueError(\"Invalid shape in axis %d: %d.\" % (idx, dim))\n *             self._shape[idx] = dim\n */\n    __pyx_t_4 = ((__pyx_v_dim <= 0) != 0);\n    if (unlikely(__pyx_t_4)) {\n\n      /* \"View.MemoryView\":153\n *         for idx, dim in enumerate(shape):\n *             if dim <= 0:\n *                 raise ValueError(\"Invalid shape in axis %d: %d.\" % (idx, dim))             # <<<<<<<<<<<<<<\n *             self._shape[idx] = dim\n * \n */\n      __pyx_t_5 = __Pyx_PyInt_From_int(__pyx_v_idx); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 153, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_5);\n      __pyx_t_6 = PyInt_FromSsize_t(__pyx_v_dim); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 153, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_6);\n      __pyx_t_10 = PyTuple_New(2); if (unlikely(!__pyx_t_10)) __PYX_ERR(2, 153, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_10);\n      __Pyx_GIVEREF(__pyx_t_5);\n      PyTuple_SET_ITEM(__pyx_t_10, 0, __pyx_t_5);\n      
__Pyx_GIVEREF(__pyx_t_6);\n      PyTuple_SET_ITEM(__pyx_t_10, 1, __pyx_t_6);\n      __pyx_t_5 = 0;\n      __pyx_t_6 = 0;\n      __pyx_t_6 = __Pyx_PyString_Format(__pyx_kp_s_Invalid_shape_in_axis_d_d, __pyx_t_10); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 153, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_6);\n      __Pyx_DECREF(__pyx_t_10); __pyx_t_10 = 0;\n      __pyx_t_10 = __Pyx_PyObject_CallOneArg(__pyx_builtin_ValueError, __pyx_t_6); if (unlikely(!__pyx_t_10)) __PYX_ERR(2, 153, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_10);\n      __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n      __Pyx_Raise(__pyx_t_10, 0, 0, 0);\n      __Pyx_DECREF(__pyx_t_10); __pyx_t_10 = 0;\n      __PYX_ERR(2, 153, __pyx_L1_error)\n\n      /* \"View.MemoryView\":152\n * \n *         for idx, dim in enumerate(shape):\n *             if dim <= 0:             # <<<<<<<<<<<<<<\n *                 raise ValueError(\"Invalid shape in axis %d: %d.\" % (idx, dim))\n *             self._shape[idx] = dim\n */\n    }\n\n    /* \"View.MemoryView\":154\n *             if dim <= 0:\n *                 raise ValueError(\"Invalid shape in axis %d: %d.\" % (idx, dim))\n *             self._shape[idx] = dim             # <<<<<<<<<<<<<<\n * \n *         cdef char order\n */\n    (__pyx_v_self->_shape[__pyx_v_idx]) = __pyx_v_dim;\n\n    /* \"View.MemoryView\":151\n * \n * \n *         for idx, dim in enumerate(shape):             # <<<<<<<<<<<<<<\n *             if dim <= 0:\n *                 raise ValueError(\"Invalid shape in axis %d: %d.\" % (idx, dim))\n */\n  }\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n  /* \"View.MemoryView\":157\n * \n *         cdef char order\n *         if mode == 'fortran':             # <<<<<<<<<<<<<<\n *             order = b'F'\n *             self.mode = u'fortran'\n */\n  __pyx_t_4 = (__Pyx_PyString_Equals(__pyx_v_mode, __pyx_n_s_fortran, Py_EQ)); if (unlikely(__pyx_t_4 < 0)) __PYX_ERR(2, 157, __pyx_L1_error)\n  if (__pyx_t_4) {\n\n    /* \"View.MemoryView\":158\n *         
cdef char order\n *         if mode == 'fortran':\n *             order = b'F'             # <<<<<<<<<<<<<<\n *             self.mode = u'fortran'\n *         elif mode == 'c':\n */\n    __pyx_v_order = 'F';\n\n    /* \"View.MemoryView\":159\n *         if mode == 'fortran':\n *             order = b'F'\n *             self.mode = u'fortran'             # <<<<<<<<<<<<<<\n *         elif mode == 'c':\n *             order = b'C'\n */\n    __Pyx_INCREF(__pyx_n_u_fortran);\n    __Pyx_GIVEREF(__pyx_n_u_fortran);\n    __Pyx_GOTREF(__pyx_v_self->mode);\n    __Pyx_DECREF(__pyx_v_self->mode);\n    __pyx_v_self->mode = __pyx_n_u_fortran;\n\n    /* \"View.MemoryView\":157\n * \n *         cdef char order\n *         if mode == 'fortran':             # <<<<<<<<<<<<<<\n *             order = b'F'\n *             self.mode = u'fortran'\n */\n    goto __pyx_L10;\n  }\n\n  /* \"View.MemoryView\":160\n *             order = b'F'\n *             self.mode = u'fortran'\n *         elif mode == 'c':             # <<<<<<<<<<<<<<\n *             order = b'C'\n *             self.mode = u'c'\n */\n  __pyx_t_4 = (__Pyx_PyString_Equals(__pyx_v_mode, __pyx_n_s_c, Py_EQ)); if (unlikely(__pyx_t_4 < 0)) __PYX_ERR(2, 160, __pyx_L1_error)\n  if (likely(__pyx_t_4)) {\n\n    /* \"View.MemoryView\":161\n *             self.mode = u'fortran'\n *         elif mode == 'c':\n *             order = b'C'             # <<<<<<<<<<<<<<\n *             self.mode = u'c'\n *         else:\n */\n    __pyx_v_order = 'C';\n\n    /* \"View.MemoryView\":162\n *         elif mode == 'c':\n *             order = b'C'\n *             self.mode = u'c'             # <<<<<<<<<<<<<<\n *         else:\n *             raise ValueError(\"Invalid mode, expected 'c' or 'fortran', got %s\" % mode)\n */\n    __Pyx_INCREF(__pyx_n_u_c);\n    __Pyx_GIVEREF(__pyx_n_u_c);\n    __Pyx_GOTREF(__pyx_v_self->mode);\n    __Pyx_DECREF(__pyx_v_self->mode);\n    __pyx_v_self->mode = __pyx_n_u_c;\n\n    /* \"View.MemoryView\":160\n *          
   order = b'F'\n *             self.mode = u'fortran'\n *         elif mode == 'c':             # <<<<<<<<<<<<<<\n *             order = b'C'\n *             self.mode = u'c'\n */\n    goto __pyx_L10;\n  }\n\n  /* \"View.MemoryView\":164\n *             self.mode = u'c'\n *         else:\n *             raise ValueError(\"Invalid mode, expected 'c' or 'fortran', got %s\" % mode)             # <<<<<<<<<<<<<<\n * \n *         self.len = fill_contig_strides_array(self._shape, self._strides,\n */\n  /*else*/ {\n    __pyx_t_3 = __Pyx_PyString_FormatSafe(__pyx_kp_s_Invalid_mode_expected_c_or_fortr, __pyx_v_mode); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 164, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_t_10 = __Pyx_PyObject_CallOneArg(__pyx_builtin_ValueError, __pyx_t_3); if (unlikely(!__pyx_t_10)) __PYX_ERR(2, 164, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_10);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __Pyx_Raise(__pyx_t_10, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_10); __pyx_t_10 = 0;\n    __PYX_ERR(2, 164, __pyx_L1_error)\n  }\n  __pyx_L10:;\n\n  /* \"View.MemoryView\":166\n *             raise ValueError(\"Invalid mode, expected 'c' or 'fortran', got %s\" % mode)\n * \n *         self.len = fill_contig_strides_array(self._shape, self._strides,             # <<<<<<<<<<<<<<\n *                                              itemsize, self.ndim, order)\n * \n */\n  __pyx_v_self->len = __pyx_fill_contig_strides_array(__pyx_v_self->_shape, __pyx_v_self->_strides, __pyx_v_itemsize, __pyx_v_self->ndim, __pyx_v_order);\n\n  /* \"View.MemoryView\":169\n *                                              itemsize, self.ndim, order)\n * \n *         self.free_data = allocate_buffer             # <<<<<<<<<<<<<<\n *         self.dtype_is_object = format == b'O'\n *         if allocate_buffer:\n */\n  __pyx_v_self->free_data = __pyx_v_allocate_buffer;\n\n  /* \"View.MemoryView\":170\n * \n *         self.free_data = allocate_buffer\n *         self.dtype_is_object = 
format == b'O'             # <<<<<<<<<<<<<<\n *         if allocate_buffer:\n * \n */\n  __pyx_t_10 = PyObject_RichCompare(__pyx_v_format, __pyx_n_b_O, Py_EQ); __Pyx_XGOTREF(__pyx_t_10); if (unlikely(!__pyx_t_10)) __PYX_ERR(2, 170, __pyx_L1_error)\n  __pyx_t_4 = __Pyx_PyObject_IsTrue(__pyx_t_10); if (unlikely((__pyx_t_4 == (int)-1) && PyErr_Occurred())) __PYX_ERR(2, 170, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_10); __pyx_t_10 = 0;\n  __pyx_v_self->dtype_is_object = __pyx_t_4;\n\n  /* \"View.MemoryView\":171\n *         self.free_data = allocate_buffer\n *         self.dtype_is_object = format == b'O'\n *         if allocate_buffer:             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_4 = (__pyx_v_allocate_buffer != 0);\n  if (__pyx_t_4) {\n\n    /* \"View.MemoryView\":174\n * \n * \n *             self.data = <char *>malloc(self.len)             # <<<<<<<<<<<<<<\n *             if not self.data:\n *                 raise MemoryError(\"unable to allocate array data.\")\n */\n    __pyx_v_self->data = ((char *)malloc(__pyx_v_self->len));\n\n    /* \"View.MemoryView\":175\n * \n *             self.data = <char *>malloc(self.len)\n *             if not self.data:             # <<<<<<<<<<<<<<\n *                 raise MemoryError(\"unable to allocate array data.\")\n * \n */\n    __pyx_t_4 = ((!(__pyx_v_self->data != 0)) != 0);\n    if (unlikely(__pyx_t_4)) {\n\n      /* \"View.MemoryView\":176\n *             self.data = <char *>malloc(self.len)\n *             if not self.data:\n *                 raise MemoryError(\"unable to allocate array data.\")             # <<<<<<<<<<<<<<\n * \n *             if self.dtype_is_object:\n */\n      __pyx_t_10 = __Pyx_PyObject_Call(__pyx_builtin_MemoryError, __pyx_tuple__7, NULL); if (unlikely(!__pyx_t_10)) __PYX_ERR(2, 176, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_10);\n      __Pyx_Raise(__pyx_t_10, 0, 0, 0);\n      __Pyx_DECREF(__pyx_t_10); __pyx_t_10 = 0;\n      __PYX_ERR(2, 176, __pyx_L1_error)\n\n      /* 
\"View.MemoryView\":175\n * \n *             self.data = <char *>malloc(self.len)\n *             if not self.data:             # <<<<<<<<<<<<<<\n *                 raise MemoryError(\"unable to allocate array data.\")\n * \n */\n    }\n\n    /* \"View.MemoryView\":178\n *                 raise MemoryError(\"unable to allocate array data.\")\n * \n *             if self.dtype_is_object:             # <<<<<<<<<<<<<<\n *                 p = <PyObject **> self.data\n *                 for i in range(self.len / itemsize):\n */\n    __pyx_t_4 = (__pyx_v_self->dtype_is_object != 0);\n    if (__pyx_t_4) {\n\n      /* \"View.MemoryView\":179\n * \n *             if self.dtype_is_object:\n *                 p = <PyObject **> self.data             # <<<<<<<<<<<<<<\n *                 for i in range(self.len / itemsize):\n *                     p[i] = Py_None\n */\n      __pyx_v_p = ((PyObject **)__pyx_v_self->data);\n\n      /* \"View.MemoryView\":180\n *             if self.dtype_is_object:\n *                 p = <PyObject **> self.data\n *                 for i in range(self.len / itemsize):             # <<<<<<<<<<<<<<\n *                     p[i] = Py_None\n *                     Py_INCREF(Py_None)\n */\n      if (unlikely(__pyx_v_itemsize == 0)) {\n        PyErr_SetString(PyExc_ZeroDivisionError, \"integer division or modulo by zero\");\n        __PYX_ERR(2, 180, __pyx_L1_error)\n      }\n      else if (sizeof(Py_ssize_t) == sizeof(long) && (!(((Py_ssize_t)-1) > 0)) && unlikely(__pyx_v_itemsize == (Py_ssize_t)-1)  && unlikely(UNARY_NEG_WOULD_OVERFLOW(__pyx_v_self->len))) {\n        PyErr_SetString(PyExc_OverflowError, \"value too large to perform division\");\n        __PYX_ERR(2, 180, __pyx_L1_error)\n      }\n      __pyx_t_1 = __Pyx_div_Py_ssize_t(__pyx_v_self->len, __pyx_v_itemsize);\n      __pyx_t_9 = __pyx_t_1;\n      for (__pyx_t_11 = 0; __pyx_t_11 < __pyx_t_9; __pyx_t_11+=1) {\n        __pyx_v_i = __pyx_t_11;\n\n        /* \"View.MemoryView\":181\n *             
    p = <PyObject **> self.data\n *                 for i in range(self.len / itemsize):\n *                     p[i] = Py_None             # <<<<<<<<<<<<<<\n *                     Py_INCREF(Py_None)\n * \n */\n        (__pyx_v_p[__pyx_v_i]) = Py_None;\n\n        /* \"View.MemoryView\":182\n *                 for i in range(self.len / itemsize):\n *                     p[i] = Py_None\n *                     Py_INCREF(Py_None)             # <<<<<<<<<<<<<<\n * \n *     @cname('getbuffer')\n */\n        Py_INCREF(Py_None);\n      }\n\n      /* \"View.MemoryView\":178\n *                 raise MemoryError(\"unable to allocate array data.\")\n * \n *             if self.dtype_is_object:             # <<<<<<<<<<<<<<\n *                 p = <PyObject **> self.data\n *                 for i in range(self.len / itemsize):\n */\n    }\n\n    /* \"View.MemoryView\":171\n *         self.free_data = allocate_buffer\n *         self.dtype_is_object = format == b'O'\n *         if allocate_buffer:             # <<<<<<<<<<<<<<\n * \n * \n */\n  }\n\n  /* \"View.MemoryView\":122\n *         cdef bint dtype_is_object\n * \n *     def __cinit__(array self, tuple shape, Py_ssize_t itemsize, format not None,             # <<<<<<<<<<<<<<\n *                   mode=\"c\", bint allocate_buffer=True):\n * \n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_10);\n  __Pyx_AddTraceback(\"View.MemoryView.array.__cinit__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_format);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":185\n * \n *     @cname('getbuffer')\n *     def __getbuffer__(self, Py_buffer *info, int flags):             # <<<<<<<<<<<<<<\n *         cdef int bufmode = -1\n *         if self.mode == u\"c\":\n */\n\n/* Python wrapper */\nstatic 
CYTHON_UNUSED int __pyx_array_getbuffer(PyObject *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags); /*proto*/\nstatic CYTHON_UNUSED int __pyx_array_getbuffer(PyObject *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__getbuffer__ (wrapper)\", 0);\n  __pyx_r = __pyx_array___pyx_pf_15View_dot_MemoryView_5array_2__getbuffer__(((struct __pyx_array_obj *)__pyx_v_self), ((Py_buffer *)__pyx_v_info), ((int)__pyx_v_flags));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic int __pyx_array___pyx_pf_15View_dot_MemoryView_5array_2__getbuffer__(struct __pyx_array_obj *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags) {\n  int __pyx_v_bufmode;\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  char *__pyx_t_4;\n  Py_ssize_t __pyx_t_5;\n  int __pyx_t_6;\n  Py_ssize_t *__pyx_t_7;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  if (__pyx_v_info == NULL) {\n    PyErr_SetString(PyExc_BufferError, \"PyObject_GetBuffer: view==NULL argument is obsolete\");\n    return -1;\n  }\n  __Pyx_RefNannySetupContext(\"__getbuffer__\", 0);\n  __pyx_v_info->obj = Py_None; __Pyx_INCREF(Py_None);\n  __Pyx_GIVEREF(__pyx_v_info->obj);\n\n  /* \"View.MemoryView\":186\n *     @cname('getbuffer')\n *     def __getbuffer__(self, Py_buffer *info, int flags):\n *         cdef int bufmode = -1             # <<<<<<<<<<<<<<\n *         if self.mode == u\"c\":\n *             bufmode = PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n */\n  __pyx_v_bufmode = -1;\n\n  /* \"View.MemoryView\":187\n *     def __getbuffer__(self, Py_buffer *info, int flags):\n *         cdef int bufmode = -1\n *         if self.mode == u\"c\":             # <<<<<<<<<<<<<<\n *             bufmode = PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         elif self.mode == 
u\"fortran\":\n */\n  __pyx_t_1 = (__Pyx_PyUnicode_Equals(__pyx_v_self->mode, __pyx_n_u_c, Py_EQ)); if (unlikely(__pyx_t_1 < 0)) __PYX_ERR(2, 187, __pyx_L1_error)\n  __pyx_t_2 = (__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":188\n *         cdef int bufmode = -1\n *         if self.mode == u\"c\":\n *             bufmode = PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS             # <<<<<<<<<<<<<<\n *         elif self.mode == u\"fortran\":\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n */\n    __pyx_v_bufmode = (PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS);\n\n    /* \"View.MemoryView\":187\n *     def __getbuffer__(self, Py_buffer *info, int flags):\n *         cdef int bufmode = -1\n *         if self.mode == u\"c\":             # <<<<<<<<<<<<<<\n *             bufmode = PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         elif self.mode == u\"fortran\":\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":189\n *         if self.mode == u\"c\":\n *             bufmode = PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         elif self.mode == u\"fortran\":             # <<<<<<<<<<<<<<\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         if not (flags & bufmode):\n */\n  __pyx_t_2 = (__Pyx_PyUnicode_Equals(__pyx_v_self->mode, __pyx_n_u_fortran, Py_EQ)); if (unlikely(__pyx_t_2 < 0)) __PYX_ERR(2, 189, __pyx_L1_error)\n  __pyx_t_1 = (__pyx_t_2 != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":190\n *             bufmode = PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         elif self.mode == u\"fortran\":\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS             # <<<<<<<<<<<<<<\n *         if not (flags & bufmode):\n *             raise ValueError(\"Can only create a buffer that is contiguous in memory.\")\n */\n    __pyx_v_bufmode = (PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS);\n\n    /* \"View.MemoryView\":189\n *         if self.mode == u\"c\":\n *             
bufmode = PyBUF_C_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         elif self.mode == u\"fortran\":             # <<<<<<<<<<<<<<\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         if not (flags & bufmode):\n */\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":191\n *         elif self.mode == u\"fortran\":\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         if not (flags & bufmode):             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Can only create a buffer that is contiguous in memory.\")\n *         info.buf = self.data\n */\n  __pyx_t_1 = ((!((__pyx_v_flags & __pyx_v_bufmode) != 0)) != 0);\n  if (unlikely(__pyx_t_1)) {\n\n    /* \"View.MemoryView\":192\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         if not (flags & bufmode):\n *             raise ValueError(\"Can only create a buffer that is contiguous in memory.\")             # <<<<<<<<<<<<<<\n *         info.buf = self.data\n *         info.len = self.len\n */\n    __pyx_t_3 = __Pyx_PyObject_Call(__pyx_builtin_ValueError, __pyx_tuple__8, NULL); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 192, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __PYX_ERR(2, 192, __pyx_L1_error)\n\n    /* \"View.MemoryView\":191\n *         elif self.mode == u\"fortran\":\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         if not (flags & bufmode):             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Can only create a buffer that is contiguous in memory.\")\n *         info.buf = self.data\n */\n  }\n\n  /* \"View.MemoryView\":193\n *         if not (flags & bufmode):\n *             raise ValueError(\"Can only create a buffer that is contiguous in memory.\")\n *         info.buf = self.data             # <<<<<<<<<<<<<<\n *         info.len = self.len\n *         info.ndim = self.ndim\n */\n  __pyx_t_4 = 
__pyx_v_self->data;\n  __pyx_v_info->buf = __pyx_t_4;\n\n  /* \"View.MemoryView\":194\n *             raise ValueError(\"Can only create a buffer that is contiguous in memory.\")\n *         info.buf = self.data\n *         info.len = self.len             # <<<<<<<<<<<<<<\n *         info.ndim = self.ndim\n *         info.shape = self._shape\n */\n  __pyx_t_5 = __pyx_v_self->len;\n  __pyx_v_info->len = __pyx_t_5;\n\n  /* \"View.MemoryView\":195\n *         info.buf = self.data\n *         info.len = self.len\n *         info.ndim = self.ndim             # <<<<<<<<<<<<<<\n *         info.shape = self._shape\n *         info.strides = self._strides\n */\n  __pyx_t_6 = __pyx_v_self->ndim;\n  __pyx_v_info->ndim = __pyx_t_6;\n\n  /* \"View.MemoryView\":196\n *         info.len = self.len\n *         info.ndim = self.ndim\n *         info.shape = self._shape             # <<<<<<<<<<<<<<\n *         info.strides = self._strides\n *         info.suboffsets = NULL\n */\n  __pyx_t_7 = __pyx_v_self->_shape;\n  __pyx_v_info->shape = __pyx_t_7;\n\n  /* \"View.MemoryView\":197\n *         info.ndim = self.ndim\n *         info.shape = self._shape\n *         info.strides = self._strides             # <<<<<<<<<<<<<<\n *         info.suboffsets = NULL\n *         info.itemsize = self.itemsize\n */\n  __pyx_t_7 = __pyx_v_self->_strides;\n  __pyx_v_info->strides = __pyx_t_7;\n\n  /* \"View.MemoryView\":198\n *         info.shape = self._shape\n *         info.strides = self._strides\n *         info.suboffsets = NULL             # <<<<<<<<<<<<<<\n *         info.itemsize = self.itemsize\n *         info.readonly = 0\n */\n  __pyx_v_info->suboffsets = NULL;\n\n  /* \"View.MemoryView\":199\n *         info.strides = self._strides\n *         info.suboffsets = NULL\n *         info.itemsize = self.itemsize             # <<<<<<<<<<<<<<\n *         info.readonly = 0\n * \n */\n  __pyx_t_5 = __pyx_v_self->itemsize;\n  __pyx_v_info->itemsize = __pyx_t_5;\n\n  /* \"View.MemoryView\":200\n * 
        info.suboffsets = NULL\n *         info.itemsize = self.itemsize\n *         info.readonly = 0             # <<<<<<<<<<<<<<\n * \n *         if flags & PyBUF_FORMAT:\n */\n  __pyx_v_info->readonly = 0;\n\n  /* \"View.MemoryView\":202\n *         info.readonly = 0\n * \n *         if flags & PyBUF_FORMAT:             # <<<<<<<<<<<<<<\n *             info.format = self.format\n *         else:\n */\n  __pyx_t_1 = ((__pyx_v_flags & PyBUF_FORMAT) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":203\n * \n *         if flags & PyBUF_FORMAT:\n *             info.format = self.format             # <<<<<<<<<<<<<<\n *         else:\n *             info.format = NULL\n */\n    __pyx_t_4 = __pyx_v_self->format;\n    __pyx_v_info->format = __pyx_t_4;\n\n    /* \"View.MemoryView\":202\n *         info.readonly = 0\n * \n *         if flags & PyBUF_FORMAT:             # <<<<<<<<<<<<<<\n *             info.format = self.format\n *         else:\n */\n    goto __pyx_L5;\n  }\n\n  /* \"View.MemoryView\":205\n *             info.format = self.format\n *         else:\n *             info.format = NULL             # <<<<<<<<<<<<<<\n * \n *         info.obj = self\n */\n  /*else*/ {\n    __pyx_v_info->format = NULL;\n  }\n  __pyx_L5:;\n\n  /* \"View.MemoryView\":207\n *             info.format = NULL\n * \n *         info.obj = self             # <<<<<<<<<<<<<<\n * \n *     __pyx_getbuffer = capsule(<void *> &__pyx_array_getbuffer, \"getbuffer(obj, view, flags)\")\n */\n  __Pyx_INCREF(((PyObject *)__pyx_v_self));\n  __Pyx_GIVEREF(((PyObject *)__pyx_v_self));\n  __Pyx_GOTREF(__pyx_v_info->obj);\n  __Pyx_DECREF(__pyx_v_info->obj);\n  __pyx_v_info->obj = ((PyObject *)__pyx_v_self);\n\n  /* \"View.MemoryView\":185\n * \n *     @cname('getbuffer')\n *     def __getbuffer__(self, Py_buffer *info, int flags):             # <<<<<<<<<<<<<<\n *         cdef int bufmode = -1\n *         if self.mode == u\"c\":\n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n 
 __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.array.__getbuffer__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  if (__pyx_v_info->obj != NULL) {\n    __Pyx_GOTREF(__pyx_v_info->obj);\n    __Pyx_DECREF(__pyx_v_info->obj); __pyx_v_info->obj = 0;\n  }\n  goto __pyx_L2;\n  __pyx_L0:;\n  if (__pyx_v_info->obj == Py_None) {\n    __Pyx_GOTREF(__pyx_v_info->obj);\n    __Pyx_DECREF(__pyx_v_info->obj); __pyx_v_info->obj = 0;\n  }\n  __pyx_L2:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":211\n *     __pyx_getbuffer = capsule(<void *> &__pyx_array_getbuffer, \"getbuffer(obj, view, flags)\")\n * \n *     def __dealloc__(array self):             # <<<<<<<<<<<<<<\n *         if self.callback_free_data != NULL:\n *             self.callback_free_data(self.data)\n */\n\n/* Python wrapper */\nstatic void __pyx_array___dealloc__(PyObject *__pyx_v_self); /*proto*/\nstatic void __pyx_array___dealloc__(PyObject *__pyx_v_self) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__dealloc__ (wrapper)\", 0);\n  __pyx_array___pyx_pf_15View_dot_MemoryView_5array_4__dealloc__(((struct __pyx_array_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\nstatic void __pyx_array___pyx_pf_15View_dot_MemoryView_5array_4__dealloc__(struct __pyx_array_obj *__pyx_v_self) {\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  __Pyx_RefNannySetupContext(\"__dealloc__\", 0);\n\n  /* \"View.MemoryView\":212\n * \n *     def __dealloc__(array self):\n *         if self.callback_free_data != NULL:             # <<<<<<<<<<<<<<\n *             self.callback_free_data(self.data)\n *         elif self.free_data:\n */\n  __pyx_t_1 = ((__pyx_v_self->callback_free_data != NULL) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":213\n *     def __dealloc__(array self):\n *         if self.callback_free_data != NULL:\n *             
self.callback_free_data(self.data)             # <<<<<<<<<<<<<<\n *         elif self.free_data:\n *             if self.dtype_is_object:\n */\n    __pyx_v_self->callback_free_data(__pyx_v_self->data);\n\n    /* \"View.MemoryView\":212\n * \n *     def __dealloc__(array self):\n *         if self.callback_free_data != NULL:             # <<<<<<<<<<<<<<\n *             self.callback_free_data(self.data)\n *         elif self.free_data:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":214\n *         if self.callback_free_data != NULL:\n *             self.callback_free_data(self.data)\n *         elif self.free_data:             # <<<<<<<<<<<<<<\n *             if self.dtype_is_object:\n *                 refcount_objects_in_slice(self.data, self._shape,\n */\n  __pyx_t_1 = (__pyx_v_self->free_data != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":215\n *             self.callback_free_data(self.data)\n *         elif self.free_data:\n *             if self.dtype_is_object:             # <<<<<<<<<<<<<<\n *                 refcount_objects_in_slice(self.data, self._shape,\n *                                           self._strides, self.ndim, False)\n */\n    __pyx_t_1 = (__pyx_v_self->dtype_is_object != 0);\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":216\n *         elif self.free_data:\n *             if self.dtype_is_object:\n *                 refcount_objects_in_slice(self.data, self._shape,             # <<<<<<<<<<<<<<\n *                                           self._strides, self.ndim, False)\n *             free(self.data)\n */\n      __pyx_memoryview_refcount_objects_in_slice(__pyx_v_self->data, __pyx_v_self->_shape, __pyx_v_self->_strides, __pyx_v_self->ndim, 0);\n\n      /* \"View.MemoryView\":215\n *             self.callback_free_data(self.data)\n *         elif self.free_data:\n *             if self.dtype_is_object:             # <<<<<<<<<<<<<<\n *                 refcount_objects_in_slice(self.data, self._shape,\n *        
                                   self._strides, self.ndim, False)\n */\n    }\n\n    /* \"View.MemoryView\":218\n *                 refcount_objects_in_slice(self.data, self._shape,\n *                                           self._strides, self.ndim, False)\n *             free(self.data)             # <<<<<<<<<<<<<<\n *         PyObject_Free(self._shape)\n * \n */\n    free(__pyx_v_self->data);\n\n    /* \"View.MemoryView\":214\n *         if self.callback_free_data != NULL:\n *             self.callback_free_data(self.data)\n *         elif self.free_data:             # <<<<<<<<<<<<<<\n *             if self.dtype_is_object:\n *                 refcount_objects_in_slice(self.data, self._shape,\n */\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":219\n *                                           self._strides, self.ndim, False)\n *             free(self.data)\n *         PyObject_Free(self._shape)             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  PyObject_Free(__pyx_v_self->_shape);\n\n  /* \"View.MemoryView\":211\n *     __pyx_getbuffer = capsule(<void *> &__pyx_array_getbuffer, \"getbuffer(obj, view, flags)\")\n * \n *     def __dealloc__(array self):             # <<<<<<<<<<<<<<\n *         if self.callback_free_data != NULL:\n *             self.callback_free_data(self.data)\n */\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\n/* \"View.MemoryView\":222\n * \n *     @property\n *     def memview(self):             # <<<<<<<<<<<<<<\n *         return self.get_memview()\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_5array_7memview_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_5array_7memview_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_5array_7memview___get__(((struct __pyx_array_obj 
*)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_5array_7memview___get__(struct __pyx_array_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":223\n *     @property\n *     def memview(self):\n *         return self.get_memview()             # <<<<<<<<<<<<<<\n * \n *     @cname('get_memview')\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = ((struct __pyx_vtabstruct_array *)__pyx_v_self->__pyx_vtab)->get_memview(__pyx_v_self); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 223, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":222\n * \n *     @property\n *     def memview(self):             # <<<<<<<<<<<<<<\n *         return self.get_memview()\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.array.memview.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":226\n * \n *     @cname('get_memview')\n *     cdef get_memview(self):             # <<<<<<<<<<<<<<\n *         flags =  PyBUF_ANY_CONTIGUOUS|PyBUF_FORMAT|PyBUF_WRITABLE\n *         return  memoryview(self, flags, self.dtype_is_object)\n */\n\nstatic PyObject *__pyx_array_get_memview(struct __pyx_array_obj *__pyx_v_self) {\n  int __pyx_v_flags;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  
__Pyx_RefNannySetupContext(\"get_memview\", 0);\n\n  /* \"View.MemoryView\":227\n *     @cname('get_memview')\n *     cdef get_memview(self):\n *         flags =  PyBUF_ANY_CONTIGUOUS|PyBUF_FORMAT|PyBUF_WRITABLE             # <<<<<<<<<<<<<<\n *         return  memoryview(self, flags, self.dtype_is_object)\n * \n */\n  __pyx_v_flags = ((PyBUF_ANY_CONTIGUOUS | PyBUF_FORMAT) | PyBUF_WRITABLE);\n\n  /* \"View.MemoryView\":228\n *     cdef get_memview(self):\n *         flags =  PyBUF_ANY_CONTIGUOUS|PyBUF_FORMAT|PyBUF_WRITABLE\n *         return  memoryview(self, flags, self.dtype_is_object)             # <<<<<<<<<<<<<<\n * \n *     def __len__(self):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __Pyx_PyInt_From_int(__pyx_v_flags); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 228, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = __Pyx_PyBool_FromLong(__pyx_v_self->dtype_is_object); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 228, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_3 = PyTuple_New(3); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 228, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __Pyx_INCREF(((PyObject *)__pyx_v_self));\n  __Pyx_GIVEREF(((PyObject *)__pyx_v_self));\n  PyTuple_SET_ITEM(__pyx_t_3, 0, ((PyObject *)__pyx_v_self));\n  __Pyx_GIVEREF(__pyx_t_1);\n  PyTuple_SET_ITEM(__pyx_t_3, 1, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_2);\n  PyTuple_SET_ITEM(__pyx_t_3, 2, __pyx_t_2);\n  __pyx_t_1 = 0;\n  __pyx_t_2 = 0;\n  __pyx_t_2 = __Pyx_PyObject_Call(((PyObject *)__pyx_memoryview_type), __pyx_t_3, NULL); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 228, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":226\n * \n *     @cname('get_memview')\n *     cdef get_memview(self):             # <<<<<<<<<<<<<<\n *         flags =  PyBUF_ANY_CONTIGUOUS|PyBUF_FORMAT|PyBUF_WRITABLE\n *         return  memoryview(self, flags, self.dtype_is_object)\n */\n\n  
/* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.array.get_memview\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":230\n *         return  memoryview(self, flags, self.dtype_is_object)\n * \n *     def __len__(self):             # <<<<<<<<<<<<<<\n *         return self._shape[0]\n * \n */\n\n/* Python wrapper */\nstatic Py_ssize_t __pyx_array___len__(PyObject *__pyx_v_self); /*proto*/\nstatic Py_ssize_t __pyx_array___len__(PyObject *__pyx_v_self) {\n  Py_ssize_t __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__len__ (wrapper)\", 0);\n  __pyx_r = __pyx_array___pyx_pf_15View_dot_MemoryView_5array_6__len__(((struct __pyx_array_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic Py_ssize_t __pyx_array___pyx_pf_15View_dot_MemoryView_5array_6__len__(struct __pyx_array_obj *__pyx_v_self) {\n  Py_ssize_t __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__len__\", 0);\n\n  /* \"View.MemoryView\":231\n * \n *     def __len__(self):\n *         return self._shape[0]             # <<<<<<<<<<<<<<\n * \n *     def __getattr__(self, attr):\n */\n  __pyx_r = (__pyx_v_self->_shape[0]);\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":230\n *         return  memoryview(self, flags, self.dtype_is_object)\n * \n *     def __len__(self):             # <<<<<<<<<<<<<<\n *         return self._shape[0]\n * \n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":233\n *         return self._shape[0]\n * \n *     def __getattr__(self, attr):             # <<<<<<<<<<<<<<\n *         return getattr(self.memview, attr)\n * \n */\n\n/* Python wrapper 
*/\nstatic PyObject *__pyx_array___getattr__(PyObject *__pyx_v_self, PyObject *__pyx_v_attr); /*proto*/\nstatic PyObject *__pyx_array___getattr__(PyObject *__pyx_v_self, PyObject *__pyx_v_attr) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__getattr__ (wrapper)\", 0);\n  __pyx_r = __pyx_array___pyx_pf_15View_dot_MemoryView_5array_8__getattr__(((struct __pyx_array_obj *)__pyx_v_self), ((PyObject *)__pyx_v_attr));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_array___pyx_pf_15View_dot_MemoryView_5array_8__getattr__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_attr) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__getattr__\", 0);\n\n  /* \"View.MemoryView\":234\n * \n *     def __getattr__(self, attr):\n *         return getattr(self.memview, attr)             # <<<<<<<<<<<<<<\n * \n *     def __getitem__(self, item):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_self), __pyx_n_s_memview); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 234, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = __Pyx_GetAttr(__pyx_t_1, __pyx_v_attr); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 234, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":233\n *         return self._shape[0]\n * \n *     def __getattr__(self, attr):             # <<<<<<<<<<<<<<\n *         return getattr(self.memview, attr)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView.array.__getattr__\", __pyx_clineno, __pyx_lineno, 
__pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":236\n *         return getattr(self.memview, attr)\n * \n *     def __getitem__(self, item):             # <<<<<<<<<<<<<<\n *         return self.memview[item]\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_array___getitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_item); /*proto*/\nstatic PyObject *__pyx_array___getitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_item) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__getitem__ (wrapper)\", 0);\n  __pyx_r = __pyx_array___pyx_pf_15View_dot_MemoryView_5array_10__getitem__(((struct __pyx_array_obj *)__pyx_v_self), ((PyObject *)__pyx_v_item));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_array___pyx_pf_15View_dot_MemoryView_5array_10__getitem__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_item) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__getitem__\", 0);\n\n  /* \"View.MemoryView\":237\n * \n *     def __getitem__(self, item):\n *         return self.memview[item]             # <<<<<<<<<<<<<<\n * \n *     def __setitem__(self, item, value):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_self), __pyx_n_s_memview); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 237, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = __Pyx_PyObject_GetItem(__pyx_t_1, __pyx_v_item); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 237, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":236\n *    
     return getattr(self.memview, attr)\n * \n *     def __getitem__(self, item):             # <<<<<<<<<<<<<<\n *         return self.memview[item]\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView.array.__getitem__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":239\n *         return self.memview[item]\n * \n *     def __setitem__(self, item, value):             # <<<<<<<<<<<<<<\n *         self.memview[item] = value\n * \n */\n\n/* Python wrapper */\nstatic int __pyx_array___setitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_item, PyObject *__pyx_v_value); /*proto*/\nstatic int __pyx_array___setitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_item, PyObject *__pyx_v_value) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__setitem__ (wrapper)\", 0);\n  __pyx_r = __pyx_array___pyx_pf_15View_dot_MemoryView_5array_12__setitem__(((struct __pyx_array_obj *)__pyx_v_self), ((PyObject *)__pyx_v_item), ((PyObject *)__pyx_v_value));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic int __pyx_array___pyx_pf_15View_dot_MemoryView_5array_12__setitem__(struct __pyx_array_obj *__pyx_v_self, PyObject *__pyx_v_item, PyObject *__pyx_v_value) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__setitem__\", 0);\n\n  /* \"View.MemoryView\":240\n * \n *     def __setitem__(self, item, value):\n *         self.memview[item] = value             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_self), __pyx_n_s_memview); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 240, 
__pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (unlikely(PyObject_SetItem(__pyx_t_1, __pyx_v_item, __pyx_v_value) < 0)) __PYX_ERR(2, 240, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":239\n *         return self.memview[item]\n * \n *     def __setitem__(self, item, value):             # <<<<<<<<<<<<<<\n *         self.memview[item] = value\n * \n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.array.__setitem__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_array_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_pw___pyx_array_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__reduce_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_array___reduce_cython__(((struct __pyx_array_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_array___reduce_cython__(CYTHON_UNUSED struct __pyx_array_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__reduce_cython__\", 0);\n\n  /* \"(tree fragment)\":2\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             
# <<<<<<<<<<<<<<\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n  __pyx_t_1 = __Pyx_PyObject_Call(__pyx_builtin_TypeError, __pyx_tuple__9, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 2, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_Raise(__pyx_t_1, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __PYX_ERR(2, 2, __pyx_L1_error)\n\n  /* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.array.__reduce_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":3\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_array_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state); /*proto*/\nstatic PyObject *__pyx_pw___pyx_array_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__setstate_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_array_2__setstate_cython__(((struct __pyx_array_obj *)__pyx_v_self), ((PyObject *)__pyx_v___pyx_state));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_array_2__setstate_cython__(CYTHON_UNUSED struct __pyx_array_obj *__pyx_v_self, CYTHON_UNUSED 
PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__setstate_cython__\", 0);\n\n  /* \"(tree fragment)\":4\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n */\n  __pyx_t_1 = __Pyx_PyObject_Call(__pyx_builtin_TypeError, __pyx_tuple__10, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 4, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_Raise(__pyx_t_1, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __PYX_ERR(2, 4, __pyx_L1_error)\n\n  /* \"(tree fragment)\":3\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.array.__setstate_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":244\n * \n * @cname(\"__pyx_array_new\")\n * cdef array array_cwrapper(tuple shape, Py_ssize_t itemsize, char *format,             # <<<<<<<<<<<<<<\n *                           char *mode, char *buf):\n *     cdef array result\n */\n\nstatic struct __pyx_array_obj *__pyx_array_new(PyObject *__pyx_v_shape, Py_ssize_t __pyx_v_itemsize, char *__pyx_v_format, char *__pyx_v_mode, char *__pyx_v_buf) {\n  struct __pyx_array_obj *__pyx_v_result = 0;\n  struct __pyx_array_obj *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  
PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"array_cwrapper\", 0);\n\n  /* \"View.MemoryView\":248\n *     cdef array result\n * \n *     if buf == NULL:             # <<<<<<<<<<<<<<\n *         result = array(shape, itemsize, format, mode.decode('ASCII'))\n *     else:\n */\n  __pyx_t_1 = ((__pyx_v_buf == NULL) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":249\n * \n *     if buf == NULL:\n *         result = array(shape, itemsize, format, mode.decode('ASCII'))             # <<<<<<<<<<<<<<\n *     else:\n *         result = array(shape, itemsize, format, mode.decode('ASCII'),\n */\n    __pyx_t_2 = PyInt_FromSsize_t(__pyx_v_itemsize); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 249, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_t_3 = __Pyx_PyBytes_FromString(__pyx_v_format); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 249, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_t_4 = __Pyx_decode_c_string(__pyx_v_mode, 0, strlen(__pyx_v_mode), NULL, NULL, PyUnicode_DecodeASCII); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 249, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_5 = PyTuple_New(4); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 249, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    __Pyx_INCREF(__pyx_v_shape);\n    __Pyx_GIVEREF(__pyx_v_shape);\n    PyTuple_SET_ITEM(__pyx_t_5, 0, __pyx_v_shape);\n    __Pyx_GIVEREF(__pyx_t_2);\n    PyTuple_SET_ITEM(__pyx_t_5, 1, __pyx_t_2);\n    __Pyx_GIVEREF(__pyx_t_3);\n    PyTuple_SET_ITEM(__pyx_t_5, 2, __pyx_t_3);\n    __Pyx_GIVEREF(__pyx_t_4);\n    PyTuple_SET_ITEM(__pyx_t_5, 3, __pyx_t_4);\n    __pyx_t_2 = 0;\n    __pyx_t_3 = 0;\n    __pyx_t_4 = 0;\n    __pyx_t_4 = __Pyx_PyObject_Call(((PyObject *)__pyx_array_type), __pyx_t_5, NULL); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 249, __pyx_L1_error)\n    
__Pyx_GOTREF(__pyx_t_4);\n    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;\n    __pyx_v_result = ((struct __pyx_array_obj *)__pyx_t_4);\n    __pyx_t_4 = 0;\n\n    /* \"View.MemoryView\":248\n *     cdef array result\n * \n *     if buf == NULL:             # <<<<<<<<<<<<<<\n *         result = array(shape, itemsize, format, mode.decode('ASCII'))\n *     else:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":251\n *         result = array(shape, itemsize, format, mode.decode('ASCII'))\n *     else:\n *         result = array(shape, itemsize, format, mode.decode('ASCII'),             # <<<<<<<<<<<<<<\n *                        allocate_buffer=False)\n *         result.data = buf\n */\n  /*else*/ {\n    __pyx_t_4 = PyInt_FromSsize_t(__pyx_v_itemsize); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 251, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_5 = __Pyx_PyBytes_FromString(__pyx_v_format); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 251, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    __pyx_t_3 = __Pyx_decode_c_string(__pyx_v_mode, 0, strlen(__pyx_v_mode), NULL, NULL, PyUnicode_DecodeASCII); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 251, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_t_2 = PyTuple_New(4); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 251, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_INCREF(__pyx_v_shape);\n    __Pyx_GIVEREF(__pyx_v_shape);\n    PyTuple_SET_ITEM(__pyx_t_2, 0, __pyx_v_shape);\n    __Pyx_GIVEREF(__pyx_t_4);\n    PyTuple_SET_ITEM(__pyx_t_2, 1, __pyx_t_4);\n    __Pyx_GIVEREF(__pyx_t_5);\n    PyTuple_SET_ITEM(__pyx_t_2, 2, __pyx_t_5);\n    __Pyx_GIVEREF(__pyx_t_3);\n    PyTuple_SET_ITEM(__pyx_t_2, 3, __pyx_t_3);\n    __pyx_t_4 = 0;\n    __pyx_t_5 = 0;\n    __pyx_t_3 = 0;\n\n    /* \"View.MemoryView\":252\n *     else:\n *         result = array(shape, itemsize, format, mode.decode('ASCII'),\n *                        allocate_buffer=False)             # <<<<<<<<<<<<<<\n *         result.data = buf\n * \n */\n    
__pyx_t_3 = __Pyx_PyDict_NewPresized(1); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 252, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    if (PyDict_SetItem(__pyx_t_3, __pyx_n_s_allocate_buffer, Py_False) < 0) __PYX_ERR(2, 252, __pyx_L1_error)\n\n    /* \"View.MemoryView\":251\n *         result = array(shape, itemsize, format, mode.decode('ASCII'))\n *     else:\n *         result = array(shape, itemsize, format, mode.decode('ASCII'),             # <<<<<<<<<<<<<<\n *                        allocate_buffer=False)\n *         result.data = buf\n */\n    __pyx_t_5 = __Pyx_PyObject_Call(((PyObject *)__pyx_array_type), __pyx_t_2, __pyx_t_3); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 251, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __pyx_v_result = ((struct __pyx_array_obj *)__pyx_t_5);\n    __pyx_t_5 = 0;\n\n    /* \"View.MemoryView\":253\n *         result = array(shape, itemsize, format, mode.decode('ASCII'),\n *                        allocate_buffer=False)\n *         result.data = buf             # <<<<<<<<<<<<<<\n * \n *     return result\n */\n    __pyx_v_result->data = __pyx_v_buf;\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":255\n *         result.data = buf\n * \n *     return result             # <<<<<<<<<<<<<<\n * \n * \n */\n  __Pyx_XDECREF(((PyObject *)__pyx_r));\n  __Pyx_INCREF(((PyObject *)__pyx_v_result));\n  __pyx_r = __pyx_v_result;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":244\n * \n * @cname(\"__pyx_array_new\")\n * cdef array array_cwrapper(tuple shape, Py_ssize_t itemsize, char *format,             # <<<<<<<<<<<<<<\n *                           char *mode, char *buf):\n *     cdef array result\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView.array_cwrapper\", __pyx_clineno, __pyx_lineno, 
__pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XDECREF((PyObject *)__pyx_v_result);\n  __Pyx_XGIVEREF((PyObject *)__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":281\n * cdef class Enum(object):\n *     cdef object name\n *     def __init__(self, name):             # <<<<<<<<<<<<<<\n *         self.name = name\n *     def __repr__(self):\n */\n\n/* Python wrapper */\nstatic int __pyx_MemviewEnum___init__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/\nstatic int __pyx_MemviewEnum___init__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {\n  PyObject *__pyx_v_name = 0;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__init__ (wrapper)\", 0);\n  {\n    static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_name,0};\n    PyObject* values[1] = {0};\n    if (unlikely(__pyx_kwds)) {\n      Py_ssize_t kw_args;\n      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);\n      switch (pos_args) {\n        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        CYTHON_FALLTHROUGH;\n        case  0: break;\n        default: goto __pyx_L5_argtuple_error;\n      }\n      kw_args = PyDict_Size(__pyx_kwds);\n      switch (pos_args) {\n        case  0:\n        if (likely((values[0] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_name)) != 0)) kw_args--;\n        else goto __pyx_L5_argtuple_error;\n      }\n      if (unlikely(kw_args > 0)) {\n        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, \"__init__\") < 0)) __PYX_ERR(2, 281, __pyx_L3_error)\n      }\n    } else if (PyTuple_GET_SIZE(__pyx_args) != 1) {\n      goto __pyx_L5_argtuple_error;\n    } else {\n      values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n    }\n    __pyx_v_name = values[0];\n  }\n  goto __pyx_L4_argument_unpacking_done;\n  
__pyx_L5_argtuple_error:;\n  __Pyx_RaiseArgtupleInvalid(\"__init__\", 1, 1, 1, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(2, 281, __pyx_L3_error)\n  __pyx_L3_error:;\n  __Pyx_AddTraceback(\"View.MemoryView.Enum.__init__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __Pyx_RefNannyFinishContext();\n  return -1;\n  __pyx_L4_argument_unpacking_done:;\n  __pyx_r = __pyx_MemviewEnum___pyx_pf_15View_dot_MemoryView_4Enum___init__(((struct __pyx_MemviewEnum_obj *)__pyx_v_self), __pyx_v_name);\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic int __pyx_MemviewEnum___pyx_pf_15View_dot_MemoryView_4Enum___init__(struct __pyx_MemviewEnum_obj *__pyx_v_self, PyObject *__pyx_v_name) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__init__\", 0);\n\n  /* \"View.MemoryView\":282\n *     cdef object name\n *     def __init__(self, name):\n *         self.name = name             # <<<<<<<<<<<<<<\n *     def __repr__(self):\n *         return self.name\n */\n  __Pyx_INCREF(__pyx_v_name);\n  __Pyx_GIVEREF(__pyx_v_name);\n  __Pyx_GOTREF(__pyx_v_self->name);\n  __Pyx_DECREF(__pyx_v_self->name);\n  __pyx_v_self->name = __pyx_v_name;\n\n  /* \"View.MemoryView\":281\n * cdef class Enum(object):\n *     cdef object name\n *     def __init__(self, name):             # <<<<<<<<<<<<<<\n *         self.name = name\n *     def __repr__(self):\n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":283\n *     def __init__(self, name):\n *         self.name = name\n *     def __repr__(self):             # <<<<<<<<<<<<<<\n *         return self.name\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_MemviewEnum___repr__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_MemviewEnum___repr__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__repr__ (wrapper)\", 0);\n 
 __pyx_r = __pyx_MemviewEnum___pyx_pf_15View_dot_MemoryView_4Enum_2__repr__(((struct __pyx_MemviewEnum_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_MemviewEnum___pyx_pf_15View_dot_MemoryView_4Enum_2__repr__(struct __pyx_MemviewEnum_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__repr__\", 0);\n\n  /* \"View.MemoryView\":284\n *         self.name = name\n *     def __repr__(self):\n *         return self.name             # <<<<<<<<<<<<<<\n * \n * cdef generic = Enum(\"<strided and direct or indirect>\")\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(__pyx_v_self->name);\n  __pyx_r = __pyx_v_self->name;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":283\n *     def __init__(self, name):\n *         self.name = name\n *     def __repr__(self):             # <<<<<<<<<<<<<<\n *         return self.name\n * \n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     cdef tuple state\n *     cdef object _dict\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_MemviewEnum_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_pw___pyx_MemviewEnum_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__reduce_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_MemviewEnum___reduce_cython__(((struct __pyx_MemviewEnum_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_MemviewEnum___reduce_cython__(struct __pyx_MemviewEnum_obj *__pyx_v_self) {\n  PyObject *__pyx_v_state = 0;\n  PyObject 
*__pyx_v__dict = 0;\n  int __pyx_v_use_setstate;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__reduce_cython__\", 0);\n\n  /* \"(tree fragment)\":5\n *     cdef object _dict\n *     cdef bint use_setstate\n *     state = (self.name,)             # <<<<<<<<<<<<<<\n *     _dict = getattr(self, '__dict__', None)\n *     if _dict is not None:\n */\n  __pyx_t_1 = PyTuple_New(1); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 5, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_INCREF(__pyx_v_self->name);\n  __Pyx_GIVEREF(__pyx_v_self->name);\n  PyTuple_SET_ITEM(__pyx_t_1, 0, __pyx_v_self->name);\n  __pyx_v_state = ((PyObject*)__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"(tree fragment)\":6\n *     cdef bint use_setstate\n *     state = (self.name,)\n *     _dict = getattr(self, '__dict__', None)             # <<<<<<<<<<<<<<\n *     if _dict is not None:\n *         state += (_dict,)\n */\n  __pyx_t_1 = __Pyx_GetAttr3(((PyObject *)__pyx_v_self), __pyx_n_s_dict, Py_None); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 6, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_v__dict = __pyx_t_1;\n  __pyx_t_1 = 0;\n\n  /* \"(tree fragment)\":7\n *     state = (self.name,)\n *     _dict = getattr(self, '__dict__', None)\n *     if _dict is not None:             # <<<<<<<<<<<<<<\n *         state += (_dict,)\n *         use_setstate = True\n */\n  __pyx_t_2 = (__pyx_v__dict != Py_None);\n  __pyx_t_3 = (__pyx_t_2 != 0);\n  if (__pyx_t_3) {\n\n    /* \"(tree fragment)\":8\n *     _dict = getattr(self, '__dict__', None)\n *     if _dict is not None:\n *         state += (_dict,)             # <<<<<<<<<<<<<<\n *         use_setstate = True\n *     else:\n */\n    __pyx_t_1 = PyTuple_New(1); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 8, 
__pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_1);\n    __Pyx_INCREF(__pyx_v__dict);\n    __Pyx_GIVEREF(__pyx_v__dict);\n    PyTuple_SET_ITEM(__pyx_t_1, 0, __pyx_v__dict);\n    __pyx_t_4 = PyNumber_InPlaceAdd(__pyx_v_state, __pyx_t_1); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 8, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n    __Pyx_DECREF_SET(__pyx_v_state, ((PyObject*)__pyx_t_4));\n    __pyx_t_4 = 0;\n\n    /* \"(tree fragment)\":9\n *     if _dict is not None:\n *         state += (_dict,)\n *         use_setstate = True             # <<<<<<<<<<<<<<\n *     else:\n *         use_setstate = self.name is not None\n */\n    __pyx_v_use_setstate = 1;\n\n    /* \"(tree fragment)\":7\n *     state = (self.name,)\n *     _dict = getattr(self, '__dict__', None)\n *     if _dict is not None:             # <<<<<<<<<<<<<<\n *         state += (_dict,)\n *         use_setstate = True\n */\n    goto __pyx_L3;\n  }\n\n  /* \"(tree fragment)\":11\n *         use_setstate = True\n *     else:\n *         use_setstate = self.name is not None             # <<<<<<<<<<<<<<\n *     if use_setstate:\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, None), state\n */\n  /*else*/ {\n    __pyx_t_3 = (__pyx_v_self->name != Py_None);\n    __pyx_v_use_setstate = __pyx_t_3;\n  }\n  __pyx_L3:;\n\n  /* \"(tree fragment)\":12\n *     else:\n *         use_setstate = self.name is not None\n *     if use_setstate:             # <<<<<<<<<<<<<<\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, None), state\n *     else:\n */\n  __pyx_t_3 = (__pyx_v_use_setstate != 0);\n  if (__pyx_t_3) {\n\n    /* \"(tree fragment)\":13\n *         use_setstate = self.name is not None\n *     if use_setstate:\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, None), state             # <<<<<<<<<<<<<<\n *     else:\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, state)\n */\n    __Pyx_XDECREF(__pyx_r);\n    
__Pyx_GetModuleGlobalName(__pyx_t_4, __pyx_n_s_pyx_unpickle_Enum); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 13, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_1 = PyTuple_New(3); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 13, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_1);\n    __Pyx_INCREF(((PyObject *)Py_TYPE(((PyObject *)__pyx_v_self))));\n    __Pyx_GIVEREF(((PyObject *)Py_TYPE(((PyObject *)__pyx_v_self))));\n    PyTuple_SET_ITEM(__pyx_t_1, 0, ((PyObject *)Py_TYPE(((PyObject *)__pyx_v_self))));\n    __Pyx_INCREF(__pyx_int_184977713);\n    __Pyx_GIVEREF(__pyx_int_184977713);\n    PyTuple_SET_ITEM(__pyx_t_1, 1, __pyx_int_184977713);\n    __Pyx_INCREF(Py_None);\n    __Pyx_GIVEREF(Py_None);\n    PyTuple_SET_ITEM(__pyx_t_1, 2, Py_None);\n    __pyx_t_5 = PyTuple_New(3); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 13, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    __Pyx_GIVEREF(__pyx_t_4);\n    PyTuple_SET_ITEM(__pyx_t_5, 0, __pyx_t_4);\n    __Pyx_GIVEREF(__pyx_t_1);\n    PyTuple_SET_ITEM(__pyx_t_5, 1, __pyx_t_1);\n    __Pyx_INCREF(__pyx_v_state);\n    __Pyx_GIVEREF(__pyx_v_state);\n    PyTuple_SET_ITEM(__pyx_t_5, 2, __pyx_v_state);\n    __pyx_t_4 = 0;\n    __pyx_t_1 = 0;\n    __pyx_r = __pyx_t_5;\n    __pyx_t_5 = 0;\n    goto __pyx_L0;\n\n    /* \"(tree fragment)\":12\n *     else:\n *         use_setstate = self.name is not None\n *     if use_setstate:             # <<<<<<<<<<<<<<\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, None), state\n *     else:\n */\n  }\n\n  /* \"(tree fragment)\":15\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, None), state\n *     else:\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, state)             # <<<<<<<<<<<<<<\n * def __setstate_cython__(self, __pyx_state):\n *     __pyx_unpickle_Enum__set_state(self, __pyx_state)\n */\n  /*else*/ {\n    __Pyx_XDECREF(__pyx_r);\n    __Pyx_GetModuleGlobalName(__pyx_t_5, __pyx_n_s_pyx_unpickle_Enum); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 15, 
__pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    __pyx_t_1 = PyTuple_New(3); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 15, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_1);\n    __Pyx_INCREF(((PyObject *)Py_TYPE(((PyObject *)__pyx_v_self))));\n    __Pyx_GIVEREF(((PyObject *)Py_TYPE(((PyObject *)__pyx_v_self))));\n    PyTuple_SET_ITEM(__pyx_t_1, 0, ((PyObject *)Py_TYPE(((PyObject *)__pyx_v_self))));\n    __Pyx_INCREF(__pyx_int_184977713);\n    __Pyx_GIVEREF(__pyx_int_184977713);\n    PyTuple_SET_ITEM(__pyx_t_1, 1, __pyx_int_184977713);\n    __Pyx_INCREF(__pyx_v_state);\n    __Pyx_GIVEREF(__pyx_v_state);\n    PyTuple_SET_ITEM(__pyx_t_1, 2, __pyx_v_state);\n    __pyx_t_4 = PyTuple_New(2); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 15, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __Pyx_GIVEREF(__pyx_t_5);\n    PyTuple_SET_ITEM(__pyx_t_4, 0, __pyx_t_5);\n    __Pyx_GIVEREF(__pyx_t_1);\n    PyTuple_SET_ITEM(__pyx_t_4, 1, __pyx_t_1);\n    __pyx_t_5 = 0;\n    __pyx_t_1 = 0;\n    __pyx_r = __pyx_t_4;\n    __pyx_t_4 = 0;\n    goto __pyx_L0;\n  }\n\n  /* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     cdef tuple state\n *     cdef object _dict\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView.Enum.__reduce_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_state);\n  __Pyx_XDECREF(__pyx_v__dict);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":16\n *     else:\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, state)\n * def __setstate_cython__(self, __pyx_state):             # <<<<<<<<<<<<<<\n *     __pyx_unpickle_Enum__set_state(self, __pyx_state)\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_MemviewEnum_3__setstate_cython__(PyObject *__pyx_v_self, 
PyObject *__pyx_v___pyx_state); /*proto*/\nstatic PyObject *__pyx_pw___pyx_MemviewEnum_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__setstate_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_MemviewEnum_2__setstate_cython__(((struct __pyx_MemviewEnum_obj *)__pyx_v_self), ((PyObject *)__pyx_v___pyx_state));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_MemviewEnum_2__setstate_cython__(struct __pyx_MemviewEnum_obj *__pyx_v_self, PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__setstate_cython__\", 0);\n\n  /* \"(tree fragment)\":17\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, state)\n * def __setstate_cython__(self, __pyx_state):\n *     __pyx_unpickle_Enum__set_state(self, __pyx_state)             # <<<<<<<<<<<<<<\n */\n  if (!(likely(PyTuple_CheckExact(__pyx_v___pyx_state))||((__pyx_v___pyx_state) == Py_None)||(PyErr_Format(PyExc_TypeError, \"Expected %.16s, got %.200s\", \"tuple\", Py_TYPE(__pyx_v___pyx_state)->tp_name), 0))) __PYX_ERR(2, 17, __pyx_L1_error)\n  __pyx_t_1 = __pyx_unpickle_Enum__set_state(__pyx_v_self, ((PyObject*)__pyx_v___pyx_state)); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 17, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n\n  /* \"(tree fragment)\":16\n *     else:\n *         return __pyx_unpickle_Enum, (type(self), 0xb068931, state)\n * def __setstate_cython__(self, __pyx_state):             # <<<<<<<<<<<<<<\n *     __pyx_unpickle_Enum__set_state(self, __pyx_state)\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  
__Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.Enum.__setstate_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":298\n * \n * @cname('__pyx_align_pointer')\n * cdef void *align_pointer(void *memory, size_t alignment) nogil:             # <<<<<<<<<<<<<<\n *     \"Align pointer memory on a given boundary\"\n *     cdef Py_intptr_t aligned_p = <Py_intptr_t> memory\n */\n\nstatic void *__pyx_align_pointer(void *__pyx_v_memory, size_t __pyx_v_alignment) {\n  Py_intptr_t __pyx_v_aligned_p;\n  size_t __pyx_v_offset;\n  void *__pyx_r;\n  int __pyx_t_1;\n\n  /* \"View.MemoryView\":300\n * cdef void *align_pointer(void *memory, size_t alignment) nogil:\n *     \"Align pointer memory on a given boundary\"\n *     cdef Py_intptr_t aligned_p = <Py_intptr_t> memory             # <<<<<<<<<<<<<<\n *     cdef size_t offset\n * \n */\n  __pyx_v_aligned_p = ((Py_intptr_t)__pyx_v_memory);\n\n  /* \"View.MemoryView\":304\n * \n *     with cython.cdivision(True):\n *         offset = aligned_p % alignment             # <<<<<<<<<<<<<<\n * \n *     if offset > 0:\n */\n  __pyx_v_offset = (__pyx_v_aligned_p % __pyx_v_alignment);\n\n  /* \"View.MemoryView\":306\n *         offset = aligned_p % alignment\n * \n *     if offset > 0:             # <<<<<<<<<<<<<<\n *         aligned_p += alignment - offset\n * \n */\n  __pyx_t_1 = ((__pyx_v_offset > 0) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":307\n * \n *     if offset > 0:\n *         aligned_p += alignment - offset             # <<<<<<<<<<<<<<\n * \n *     return <void *> aligned_p\n */\n    __pyx_v_aligned_p = (__pyx_v_aligned_p + (__pyx_v_alignment - __pyx_v_offset));\n\n    /* \"View.MemoryView\":306\n *         offset = aligned_p % alignment\n * \n *     if offset > 0:             # <<<<<<<<<<<<<<\n *         aligned_p += alignment - offset\n * \n */\n  
}\n\n  /* \"View.MemoryView\":309\n *         aligned_p += alignment - offset\n * \n *     return <void *> aligned_p             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = ((void *)__pyx_v_aligned_p);\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":298\n * \n * @cname('__pyx_align_pointer')\n * cdef void *align_pointer(void *memory, size_t alignment) nogil:             # <<<<<<<<<<<<<<\n *     \"Align pointer memory on a given boundary\"\n *     cdef Py_intptr_t aligned_p = <Py_intptr_t> memory\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":345\n *     cdef __Pyx_TypeInfo *typeinfo\n * \n *     def __cinit__(memoryview self, object obj, int flags, bint dtype_is_object=False):             # <<<<<<<<<<<<<<\n *         self.obj = obj\n *         self.flags = flags\n */\n\n/* Python wrapper */\nstatic int __pyx_memoryview___cinit__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/\nstatic int __pyx_memoryview___cinit__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {\n  PyObject *__pyx_v_obj = 0;\n  int __pyx_v_flags;\n  int __pyx_v_dtype_is_object;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__cinit__ (wrapper)\", 0);\n  {\n    static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_obj,&__pyx_n_s_flags,&__pyx_n_s_dtype_is_object,0};\n    PyObject* values[3] = {0,0,0};\n    if (unlikely(__pyx_kwds)) {\n      Py_ssize_t kw_args;\n      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);\n      switch (pos_args) {\n        case  3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n        CYTHON_FALLTHROUGH;\n        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n        CYTHON_FALLTHROUGH;\n        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        CYTHON_FALLTHROUGH;\n        case  0: break;\n        default: goto 
__pyx_L5_argtuple_error;\n      }\n      kw_args = PyDict_Size(__pyx_kwds);\n      switch (pos_args) {\n        case  0:\n        if (likely((values[0] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_obj)) != 0)) kw_args--;\n        else goto __pyx_L5_argtuple_error;\n        CYTHON_FALLTHROUGH;\n        case  1:\n        if (likely((values[1] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_flags)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"__cinit__\", 0, 2, 3, 1); __PYX_ERR(2, 345, __pyx_L3_error)\n        }\n        CYTHON_FALLTHROUGH;\n        case  2:\n        if (kw_args > 0) {\n          PyObject* value = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_dtype_is_object);\n          if (value) { values[2] = value; kw_args--; }\n        }\n      }\n      if (unlikely(kw_args > 0)) {\n        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, \"__cinit__\") < 0)) __PYX_ERR(2, 345, __pyx_L3_error)\n      }\n    } else {\n      switch (PyTuple_GET_SIZE(__pyx_args)) {\n        case  3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n        CYTHON_FALLTHROUGH;\n        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n        values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        break;\n        default: goto __pyx_L5_argtuple_error;\n      }\n    }\n    __pyx_v_obj = values[0];\n    __pyx_v_flags = __Pyx_PyInt_As_int(values[1]); if (unlikely((__pyx_v_flags == (int)-1) && PyErr_Occurred())) __PYX_ERR(2, 345, __pyx_L3_error)\n    if (values[2]) {\n      __pyx_v_dtype_is_object = __Pyx_PyObject_IsTrue(values[2]); if (unlikely((__pyx_v_dtype_is_object == (int)-1) && PyErr_Occurred())) __PYX_ERR(2, 345, __pyx_L3_error)\n    } else {\n      __pyx_v_dtype_is_object = ((int)0);\n    }\n  }\n  goto __pyx_L4_argument_unpacking_done;\n  __pyx_L5_argtuple_error:;\n  __Pyx_RaiseArgtupleInvalid(\"__cinit__\", 0, 2, 3, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(2, 345, __pyx_L3_error)\n  __pyx_L3_error:;\n  
__Pyx_AddTraceback(\"View.MemoryView.memoryview.__cinit__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __Pyx_RefNannyFinishContext();\n  return -1;\n  __pyx_L4_argument_unpacking_done:;\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview___cinit__(((struct __pyx_memoryview_obj *)__pyx_v_self), __pyx_v_obj, __pyx_v_flags, __pyx_v_dtype_is_object);\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic int __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview___cinit__(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_obj, int __pyx_v_flags, int __pyx_v_dtype_is_object) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  int __pyx_t_4;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__cinit__\", 0);\n\n  /* \"View.MemoryView\":346\n * \n *     def __cinit__(memoryview self, object obj, int flags, bint dtype_is_object=False):\n *         self.obj = obj             # <<<<<<<<<<<<<<\n *         self.flags = flags\n *         if type(self) is memoryview or obj is not None:\n */\n  __Pyx_INCREF(__pyx_v_obj);\n  __Pyx_GIVEREF(__pyx_v_obj);\n  __Pyx_GOTREF(__pyx_v_self->obj);\n  __Pyx_DECREF(__pyx_v_self->obj);\n  __pyx_v_self->obj = __pyx_v_obj;\n\n  /* \"View.MemoryView\":347\n *     def __cinit__(memoryview self, object obj, int flags, bint dtype_is_object=False):\n *         self.obj = obj\n *         self.flags = flags             # <<<<<<<<<<<<<<\n *         if type(self) is memoryview or obj is not None:\n *             __Pyx_GetBuffer(obj, &self.view, flags)\n */\n  __pyx_v_self->flags = __pyx_v_flags;\n\n  /* \"View.MemoryView\":348\n *         self.obj = obj\n *         self.flags = flags\n *         if type(self) is memoryview or obj is not None:             # <<<<<<<<<<<<<<\n *             __Pyx_GetBuffer(obj, &self.view, flags)\n *             if 
<PyObject *> self.view.obj == NULL:\n */\n  __pyx_t_2 = (((PyObject *)Py_TYPE(((PyObject *)__pyx_v_self))) == ((PyObject *)__pyx_memoryview_type));\n  __pyx_t_3 = (__pyx_t_2 != 0);\n  if (!__pyx_t_3) {\n  } else {\n    __pyx_t_1 = __pyx_t_3;\n    goto __pyx_L4_bool_binop_done;\n  }\n  __pyx_t_3 = (__pyx_v_obj != Py_None);\n  __pyx_t_2 = (__pyx_t_3 != 0);\n  __pyx_t_1 = __pyx_t_2;\n  __pyx_L4_bool_binop_done:;\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":349\n *         self.flags = flags\n *         if type(self) is memoryview or obj is not None:\n *             __Pyx_GetBuffer(obj, &self.view, flags)             # <<<<<<<<<<<<<<\n *             if <PyObject *> self.view.obj == NULL:\n *                 (<__pyx_buffer *> &self.view).obj = Py_None\n */\n    __pyx_t_4 = __Pyx_GetBuffer(__pyx_v_obj, (&__pyx_v_self->view), __pyx_v_flags); if (unlikely(__pyx_t_4 == ((int)-1))) __PYX_ERR(2, 349, __pyx_L1_error)\n\n    /* \"View.MemoryView\":350\n *         if type(self) is memoryview or obj is not None:\n *             __Pyx_GetBuffer(obj, &self.view, flags)\n *             if <PyObject *> self.view.obj == NULL:             # <<<<<<<<<<<<<<\n *                 (<__pyx_buffer *> &self.view).obj = Py_None\n *                 Py_INCREF(Py_None)\n */\n    __pyx_t_1 = ((((PyObject *)__pyx_v_self->view.obj) == NULL) != 0);\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":351\n *             __Pyx_GetBuffer(obj, &self.view, flags)\n *             if <PyObject *> self.view.obj == NULL:\n *                 (<__pyx_buffer *> &self.view).obj = Py_None             # <<<<<<<<<<<<<<\n *                 Py_INCREF(Py_None)\n * \n */\n      ((Py_buffer *)(&__pyx_v_self->view))->obj = Py_None;\n\n      /* \"View.MemoryView\":352\n *             if <PyObject *> self.view.obj == NULL:\n *                 (<__pyx_buffer *> &self.view).obj = Py_None\n *                 Py_INCREF(Py_None)             # <<<<<<<<<<<<<<\n * \n *         global __pyx_memoryview_thread_locks_used\n */\n   
   Py_INCREF(Py_None);\n\n      /* \"View.MemoryView\":350\n *         if type(self) is memoryview or obj is not None:\n *             __Pyx_GetBuffer(obj, &self.view, flags)\n *             if <PyObject *> self.view.obj == NULL:             # <<<<<<<<<<<<<<\n *                 (<__pyx_buffer *> &self.view).obj = Py_None\n *                 Py_INCREF(Py_None)\n */\n    }\n\n    /* \"View.MemoryView\":348\n *         self.obj = obj\n *         self.flags = flags\n *         if type(self) is memoryview or obj is not None:             # <<<<<<<<<<<<<<\n *             __Pyx_GetBuffer(obj, &self.view, flags)\n *             if <PyObject *> self.view.obj == NULL:\n */\n  }\n\n  /* \"View.MemoryView\":355\n * \n *         global __pyx_memoryview_thread_locks_used\n *         if __pyx_memoryview_thread_locks_used < THREAD_LOCKS_PREALLOCATED:             # <<<<<<<<<<<<<<\n *             self.lock = __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]\n *             __pyx_memoryview_thread_locks_used += 1\n */\n  __pyx_t_1 = ((__pyx_memoryview_thread_locks_used < 8) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":356\n *         global __pyx_memoryview_thread_locks_used\n *         if __pyx_memoryview_thread_locks_used < THREAD_LOCKS_PREALLOCATED:\n *             self.lock = __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]             # <<<<<<<<<<<<<<\n *             __pyx_memoryview_thread_locks_used += 1\n *         if self.lock is NULL:\n */\n    __pyx_v_self->lock = (__pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]);\n\n    /* \"View.MemoryView\":357\n *         if __pyx_memoryview_thread_locks_used < THREAD_LOCKS_PREALLOCATED:\n *             self.lock = __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]\n *             __pyx_memoryview_thread_locks_used += 1             # <<<<<<<<<<<<<<\n *         if self.lock is NULL:\n *             self.lock = PyThread_allocate_lock()\n */\n    
__pyx_memoryview_thread_locks_used = (__pyx_memoryview_thread_locks_used + 1);\n\n    /* \"View.MemoryView\":355\n * \n *         global __pyx_memoryview_thread_locks_used\n *         if __pyx_memoryview_thread_locks_used < THREAD_LOCKS_PREALLOCATED:             # <<<<<<<<<<<<<<\n *             self.lock = __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]\n *             __pyx_memoryview_thread_locks_used += 1\n */\n  }\n\n  /* \"View.MemoryView\":358\n *             self.lock = __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]\n *             __pyx_memoryview_thread_locks_used += 1\n *         if self.lock is NULL:             # <<<<<<<<<<<<<<\n *             self.lock = PyThread_allocate_lock()\n *             if self.lock is NULL:\n */\n  __pyx_t_1 = ((__pyx_v_self->lock == NULL) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":359\n *             __pyx_memoryview_thread_locks_used += 1\n *         if self.lock is NULL:\n *             self.lock = PyThread_allocate_lock()             # <<<<<<<<<<<<<<\n *             if self.lock is NULL:\n *                 raise MemoryError\n */\n    __pyx_v_self->lock = PyThread_allocate_lock();\n\n    /* \"View.MemoryView\":360\n *         if self.lock is NULL:\n *             self.lock = PyThread_allocate_lock()\n *             if self.lock is NULL:             # <<<<<<<<<<<<<<\n *                 raise MemoryError\n * \n */\n    __pyx_t_1 = ((__pyx_v_self->lock == NULL) != 0);\n    if (unlikely(__pyx_t_1)) {\n\n      /* \"View.MemoryView\":361\n *             self.lock = PyThread_allocate_lock()\n *             if self.lock is NULL:\n *                 raise MemoryError             # <<<<<<<<<<<<<<\n * \n *         if flags & PyBUF_FORMAT:\n */\n      PyErr_NoMemory(); __PYX_ERR(2, 361, __pyx_L1_error)\n\n      /* \"View.MemoryView\":360\n *         if self.lock is NULL:\n *             self.lock = PyThread_allocate_lock()\n *             if self.lock is NULL:             # 
<<<<<<<<<<<<<<\n *                 raise MemoryError\n * \n */\n    }\n\n    /* \"View.MemoryView\":358\n *             self.lock = __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]\n *             __pyx_memoryview_thread_locks_used += 1\n *         if self.lock is NULL:             # <<<<<<<<<<<<<<\n *             self.lock = PyThread_allocate_lock()\n *             if self.lock is NULL:\n */\n  }\n\n  /* \"View.MemoryView\":363\n *                 raise MemoryError\n * \n *         if flags & PyBUF_FORMAT:             # <<<<<<<<<<<<<<\n *             self.dtype_is_object = (self.view.format[0] == b'O' and self.view.format[1] == b'\\0')\n *         else:\n */\n  __pyx_t_1 = ((__pyx_v_flags & PyBUF_FORMAT) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":364\n * \n *         if flags & PyBUF_FORMAT:\n *             self.dtype_is_object = (self.view.format[0] == b'O' and self.view.format[1] == b'\\0')             # <<<<<<<<<<<<<<\n *         else:\n *             self.dtype_is_object = dtype_is_object\n */\n    __pyx_t_2 = (((__pyx_v_self->view.format[0]) == 'O') != 0);\n    if (__pyx_t_2) {\n    } else {\n      __pyx_t_1 = __pyx_t_2;\n      goto __pyx_L11_bool_binop_done;\n    }\n    __pyx_t_2 = (((__pyx_v_self->view.format[1]) == '\\x00') != 0);\n    __pyx_t_1 = __pyx_t_2;\n    __pyx_L11_bool_binop_done:;\n    __pyx_v_self->dtype_is_object = __pyx_t_1;\n\n    /* \"View.MemoryView\":363\n *                 raise MemoryError\n * \n *         if flags & PyBUF_FORMAT:             # <<<<<<<<<<<<<<\n *             self.dtype_is_object = (self.view.format[0] == b'O' and self.view.format[1] == b'\\0')\n *         else:\n */\n    goto __pyx_L10;\n  }\n\n  /* \"View.MemoryView\":366\n *             self.dtype_is_object = (self.view.format[0] == b'O' and self.view.format[1] == b'\\0')\n *         else:\n *             self.dtype_is_object = dtype_is_object             # <<<<<<<<<<<<<<\n * \n *         self.acquisition_count_aligned_p = 
<__pyx_atomic_int *> align_pointer(\n */\n  /*else*/ {\n    __pyx_v_self->dtype_is_object = __pyx_v_dtype_is_object;\n  }\n  __pyx_L10:;\n\n  /* \"View.MemoryView\":368\n *             self.dtype_is_object = dtype_is_object\n * \n *         self.acquisition_count_aligned_p = <__pyx_atomic_int *> align_pointer(             # <<<<<<<<<<<<<<\n *                   <void *> &self.acquisition_count[0], sizeof(__pyx_atomic_int))\n *         self.typeinfo = NULL\n */\n  __pyx_v_self->acquisition_count_aligned_p = ((__pyx_atomic_int *)__pyx_align_pointer(((void *)(&(__pyx_v_self->acquisition_count[0]))), (sizeof(__pyx_atomic_int))));\n\n  /* \"View.MemoryView\":370\n *         self.acquisition_count_aligned_p = <__pyx_atomic_int *> align_pointer(\n *                   <void *> &self.acquisition_count[0], sizeof(__pyx_atomic_int))\n *         self.typeinfo = NULL             # <<<<<<<<<<<<<<\n * \n *     def __dealloc__(memoryview self):\n */\n  __pyx_v_self->typeinfo = NULL;\n\n  /* \"View.MemoryView\":345\n *     cdef __Pyx_TypeInfo *typeinfo\n * \n *     def __cinit__(memoryview self, object obj, int flags, bint dtype_is_object=False):             # <<<<<<<<<<<<<<\n *         self.obj = obj\n *         self.flags = flags\n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__cinit__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":372\n *         self.typeinfo = NULL\n * \n *     def __dealloc__(memoryview self):             # <<<<<<<<<<<<<<\n *         if self.obj is not None:\n *             __Pyx_ReleaseBuffer(&self.view)\n */\n\n/* Python wrapper */\nstatic void __pyx_memoryview___dealloc__(PyObject *__pyx_v_self); /*proto*/\nstatic void __pyx_memoryview___dealloc__(PyObject *__pyx_v_self) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__dealloc__ 
(wrapper)\", 0);\n  __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_2__dealloc__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\nstatic void __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_2__dealloc__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  int __pyx_v_i;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  int __pyx_t_4;\n  int __pyx_t_5;\n  PyThread_type_lock __pyx_t_6;\n  PyThread_type_lock __pyx_t_7;\n  __Pyx_RefNannySetupContext(\"__dealloc__\", 0);\n\n  /* \"View.MemoryView\":373\n * \n *     def __dealloc__(memoryview self):\n *         if self.obj is not None:             # <<<<<<<<<<<<<<\n *             __Pyx_ReleaseBuffer(&self.view)\n *         elif (<__pyx_buffer *> &self.view).obj == Py_None:\n */\n  __pyx_t_1 = (__pyx_v_self->obj != Py_None);\n  __pyx_t_2 = (__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":374\n *     def __dealloc__(memoryview self):\n *         if self.obj is not None:\n *             __Pyx_ReleaseBuffer(&self.view)             # <<<<<<<<<<<<<<\n *         elif (<__pyx_buffer *> &self.view).obj == Py_None:\n * \n */\n    __Pyx_ReleaseBuffer((&__pyx_v_self->view));\n\n    /* \"View.MemoryView\":373\n * \n *     def __dealloc__(memoryview self):\n *         if self.obj is not None:             # <<<<<<<<<<<<<<\n *             __Pyx_ReleaseBuffer(&self.view)\n *         elif (<__pyx_buffer *> &self.view).obj == Py_None:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":375\n *         if self.obj is not None:\n *             __Pyx_ReleaseBuffer(&self.view)\n *         elif (<__pyx_buffer *> &self.view).obj == Py_None:             # <<<<<<<<<<<<<<\n * \n *             (<__pyx_buffer *> &self.view).obj = NULL\n */\n  __pyx_t_2 = ((((Py_buffer *)(&__pyx_v_self->view))->obj == Py_None) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":377\n *         elif (<__pyx_buffer 
*> &self.view).obj == Py_None:\n * \n *             (<__pyx_buffer *> &self.view).obj = NULL             # <<<<<<<<<<<<<<\n *             Py_DECREF(Py_None)\n * \n */\n    ((Py_buffer *)(&__pyx_v_self->view))->obj = NULL;\n\n    /* \"View.MemoryView\":378\n * \n *             (<__pyx_buffer *> &self.view).obj = NULL\n *             Py_DECREF(Py_None)             # <<<<<<<<<<<<<<\n * \n *         cdef int i\n */\n    Py_DECREF(Py_None);\n\n    /* \"View.MemoryView\":375\n *         if self.obj is not None:\n *             __Pyx_ReleaseBuffer(&self.view)\n *         elif (<__pyx_buffer *> &self.view).obj == Py_None:             # <<<<<<<<<<<<<<\n * \n *             (<__pyx_buffer *> &self.view).obj = NULL\n */\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":382\n *         cdef int i\n *         global __pyx_memoryview_thread_locks_used\n *         if self.lock != NULL:             # <<<<<<<<<<<<<<\n *             for i in range(__pyx_memoryview_thread_locks_used):\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:\n */\n  __pyx_t_2 = ((__pyx_v_self->lock != NULL) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":383\n *         global __pyx_memoryview_thread_locks_used\n *         if self.lock != NULL:\n *             for i in range(__pyx_memoryview_thread_locks_used):             # <<<<<<<<<<<<<<\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:\n *                     __pyx_memoryview_thread_locks_used -= 1\n */\n    __pyx_t_3 = __pyx_memoryview_thread_locks_used;\n    __pyx_t_4 = __pyx_t_3;\n    for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) {\n      __pyx_v_i = __pyx_t_5;\n\n      /* \"View.MemoryView\":384\n *         if self.lock != NULL:\n *             for i in range(__pyx_memoryview_thread_locks_used):\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:             # <<<<<<<<<<<<<<\n *                     __pyx_memoryview_thread_locks_used -= 1\n *                     if i != 
__pyx_memoryview_thread_locks_used:\n */\n      __pyx_t_2 = (((__pyx_memoryview_thread_locks[__pyx_v_i]) == __pyx_v_self->lock) != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":385\n *             for i in range(__pyx_memoryview_thread_locks_used):\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:\n *                     __pyx_memoryview_thread_locks_used -= 1             # <<<<<<<<<<<<<<\n *                     if i != __pyx_memoryview_thread_locks_used:\n *                         __pyx_memoryview_thread_locks[i], __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used] = (\n */\n        __pyx_memoryview_thread_locks_used = (__pyx_memoryview_thread_locks_used - 1);\n\n        /* \"View.MemoryView\":386\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:\n *                     __pyx_memoryview_thread_locks_used -= 1\n *                     if i != __pyx_memoryview_thread_locks_used:             # <<<<<<<<<<<<<<\n *                         __pyx_memoryview_thread_locks[i], __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used] = (\n *                             __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used], __pyx_memoryview_thread_locks[i])\n */\n        __pyx_t_2 = ((__pyx_v_i != __pyx_memoryview_thread_locks_used) != 0);\n        if (__pyx_t_2) {\n\n          /* \"View.MemoryView\":388\n *                     if i != __pyx_memoryview_thread_locks_used:\n *                         __pyx_memoryview_thread_locks[i], __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used] = (\n *                             __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used], __pyx_memoryview_thread_locks[i])             # <<<<<<<<<<<<<<\n *                     break\n *             else:\n */\n          __pyx_t_6 = (__pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]);\n          __pyx_t_7 = (__pyx_memoryview_thread_locks[__pyx_v_i]);\n\n          /* 
\"View.MemoryView\":387\n *                     __pyx_memoryview_thread_locks_used -= 1\n *                     if i != __pyx_memoryview_thread_locks_used:\n *                         __pyx_memoryview_thread_locks[i], __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used] = (             # <<<<<<<<<<<<<<\n *                             __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used], __pyx_memoryview_thread_locks[i])\n *                     break\n */\n          (__pyx_memoryview_thread_locks[__pyx_v_i]) = __pyx_t_6;\n          (__pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used]) = __pyx_t_7;\n\n          /* \"View.MemoryView\":386\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:\n *                     __pyx_memoryview_thread_locks_used -= 1\n *                     if i != __pyx_memoryview_thread_locks_used:             # <<<<<<<<<<<<<<\n *                         __pyx_memoryview_thread_locks[i], __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used] = (\n *                             __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used], __pyx_memoryview_thread_locks[i])\n */\n        }\n\n        /* \"View.MemoryView\":389\n *                         __pyx_memoryview_thread_locks[i], __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used] = (\n *                             __pyx_memoryview_thread_locks[__pyx_memoryview_thread_locks_used], __pyx_memoryview_thread_locks[i])\n *                     break             # <<<<<<<<<<<<<<\n *             else:\n *                 PyThread_free_lock(self.lock)\n */\n        goto __pyx_L6_break;\n\n        /* \"View.MemoryView\":384\n *         if self.lock != NULL:\n *             for i in range(__pyx_memoryview_thread_locks_used):\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:             # <<<<<<<<<<<<<<\n *                     __pyx_memoryview_thread_locks_used -= 1\n *                     if i != 
__pyx_memoryview_thread_locks_used:\n */\n      }\n    }\n    /*else*/ {\n\n      /* \"View.MemoryView\":391\n *                     break\n *             else:\n *                 PyThread_free_lock(self.lock)             # <<<<<<<<<<<<<<\n * \n *     cdef char *get_item_pointer(memoryview self, object index) except NULL:\n */\n      PyThread_free_lock(__pyx_v_self->lock);\n    }\n    __pyx_L6_break:;\n\n    /* \"View.MemoryView\":382\n *         cdef int i\n *         global __pyx_memoryview_thread_locks_used\n *         if self.lock != NULL:             # <<<<<<<<<<<<<<\n *             for i in range(__pyx_memoryview_thread_locks_used):\n *                 if __pyx_memoryview_thread_locks[i] is self.lock:\n */\n  }\n\n  /* \"View.MemoryView\":372\n *         self.typeinfo = NULL\n * \n *     def __dealloc__(memoryview self):             # <<<<<<<<<<<<<<\n *         if self.obj is not None:\n *             __Pyx_ReleaseBuffer(&self.view)\n */\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\n/* \"View.MemoryView\":393\n *                 PyThread_free_lock(self.lock)\n * \n *     cdef char *get_item_pointer(memoryview self, object index) except NULL:             # <<<<<<<<<<<<<<\n *         cdef Py_ssize_t dim\n *         cdef char *itemp = <char *> self.view.buf\n */\n\nstatic char *__pyx_memoryview_get_item_pointer(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_index) {\n  Py_ssize_t __pyx_v_dim;\n  char *__pyx_v_itemp;\n  PyObject *__pyx_v_idx = NULL;\n  char *__pyx_r;\n  __Pyx_RefNannyDeclarations\n  Py_ssize_t __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  Py_ssize_t __pyx_t_3;\n  PyObject *(*__pyx_t_4)(PyObject *);\n  PyObject *__pyx_t_5 = NULL;\n  Py_ssize_t __pyx_t_6;\n  char *__pyx_t_7;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"get_item_pointer\", 0);\n\n  /* \"View.MemoryView\":395\n *     cdef char *get_item_pointer(memoryview self, object 
index) except NULL:\n *         cdef Py_ssize_t dim\n *         cdef char *itemp = <char *> self.view.buf             # <<<<<<<<<<<<<<\n * \n *         for dim, idx in enumerate(index):\n */\n  __pyx_v_itemp = ((char *)__pyx_v_self->view.buf);\n\n  /* \"View.MemoryView\":397\n *         cdef char *itemp = <char *> self.view.buf\n * \n *         for dim, idx in enumerate(index):             # <<<<<<<<<<<<<<\n *             itemp = pybuffer_index(&self.view, itemp, idx, dim)\n * \n */\n  __pyx_t_1 = 0;\n  if (likely(PyList_CheckExact(__pyx_v_index)) || PyTuple_CheckExact(__pyx_v_index)) {\n    __pyx_t_2 = __pyx_v_index; __Pyx_INCREF(__pyx_t_2); __pyx_t_3 = 0;\n    __pyx_t_4 = NULL;\n  } else {\n    __pyx_t_3 = -1; __pyx_t_2 = PyObject_GetIter(__pyx_v_index); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 397, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_t_4 = Py_TYPE(__pyx_t_2)->tp_iternext; if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 397, __pyx_L1_error)\n  }\n  for (;;) {\n    if (likely(!__pyx_t_4)) {\n      if (likely(PyList_CheckExact(__pyx_t_2))) {\n        if (__pyx_t_3 >= PyList_GET_SIZE(__pyx_t_2)) break;\n        #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n        __pyx_t_5 = PyList_GET_ITEM(__pyx_t_2, __pyx_t_3); __Pyx_INCREF(__pyx_t_5); __pyx_t_3++; if (unlikely(0 < 0)) __PYX_ERR(2, 397, __pyx_L1_error)\n        #else\n        __pyx_t_5 = PySequence_ITEM(__pyx_t_2, __pyx_t_3); __pyx_t_3++; if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 397, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_5);\n        #endif\n      } else {\n        if (__pyx_t_3 >= PyTuple_GET_SIZE(__pyx_t_2)) break;\n        #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n        __pyx_t_5 = PyTuple_GET_ITEM(__pyx_t_2, __pyx_t_3); __Pyx_INCREF(__pyx_t_5); __pyx_t_3++; if (unlikely(0 < 0)) __PYX_ERR(2, 397, __pyx_L1_error)\n        #else\n        __pyx_t_5 = PySequence_ITEM(__pyx_t_2, __pyx_t_3); __pyx_t_3++; if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 397, 
__pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_5);\n        #endif\n      }\n    } else {\n      __pyx_t_5 = __pyx_t_4(__pyx_t_2);\n      if (unlikely(!__pyx_t_5)) {\n        PyObject* exc_type = PyErr_Occurred();\n        if (exc_type) {\n          if (likely(__Pyx_PyErr_GivenExceptionMatches(exc_type, PyExc_StopIteration))) PyErr_Clear();\n          else __PYX_ERR(2, 397, __pyx_L1_error)\n        }\n        break;\n      }\n      __Pyx_GOTREF(__pyx_t_5);\n    }\n    __Pyx_XDECREF_SET(__pyx_v_idx, __pyx_t_5);\n    __pyx_t_5 = 0;\n    __pyx_v_dim = __pyx_t_1;\n    __pyx_t_1 = (__pyx_t_1 + 1);\n\n    /* \"View.MemoryView\":398\n * \n *         for dim, idx in enumerate(index):\n *             itemp = pybuffer_index(&self.view, itemp, idx, dim)             # <<<<<<<<<<<<<<\n * \n *         return itemp\n */\n    __pyx_t_6 = __Pyx_PyIndex_AsSsize_t(__pyx_v_idx); if (unlikely((__pyx_t_6 == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 398, __pyx_L1_error)\n    __pyx_t_7 = __pyx_pybuffer_index((&__pyx_v_self->view), __pyx_v_itemp, __pyx_t_6, __pyx_v_dim); if (unlikely(__pyx_t_7 == ((char *)NULL))) __PYX_ERR(2, 398, __pyx_L1_error)\n    __pyx_v_itemp = __pyx_t_7;\n\n    /* \"View.MemoryView\":397\n *         cdef char *itemp = <char *> self.view.buf\n * \n *         for dim, idx in enumerate(index):             # <<<<<<<<<<<<<<\n *             itemp = pybuffer_index(&self.view, itemp, idx, dim)\n * \n */\n  }\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n\n  /* \"View.MemoryView\":400\n *             itemp = pybuffer_index(&self.view, itemp, idx, dim)\n * \n *         return itemp             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = __pyx_v_itemp;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":393\n *                 PyThread_free_lock(self.lock)\n * \n *     cdef char *get_item_pointer(memoryview self, object index) except NULL:             # <<<<<<<<<<<<<<\n *         cdef Py_ssize_t dim\n *         cdef char *itemp = <char *> self.view.buf\n */\n\n  /* function 
exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.get_item_pointer\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_idx);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":403\n * \n * \n *     def __getitem__(memoryview self, object index):             # <<<<<<<<<<<<<<\n *         if index is Ellipsis:\n *             return self\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_memoryview___getitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_index); /*proto*/\nstatic PyObject *__pyx_memoryview___getitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_index) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__getitem__ (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_4__getitem__(((struct __pyx_memoryview_obj *)__pyx_v_self), ((PyObject *)__pyx_v_index));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_4__getitem__(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_index) {\n  PyObject *__pyx_v_have_slices = NULL;\n  PyObject *__pyx_v_indices = NULL;\n  char *__pyx_v_itemp;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  char *__pyx_t_6;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__getitem__\", 0);\n\n  /* \"View.MemoryView\":404\n * \n *     def __getitem__(memoryview self, object index):\n *         if index is Ellipsis:             # <<<<<<<<<<<<<<\n *             return self\n * \n */\n  __pyx_t_1 = (__pyx_v_index == __pyx_builtin_Ellipsis);\n  __pyx_t_2 = 
(__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":405\n *     def __getitem__(memoryview self, object index):\n *         if index is Ellipsis:\n *             return self             # <<<<<<<<<<<<<<\n * \n *         have_slices, indices = _unellipsify(index, self.view.ndim)\n */\n    __Pyx_XDECREF(__pyx_r);\n    __Pyx_INCREF(((PyObject *)__pyx_v_self));\n    __pyx_r = ((PyObject *)__pyx_v_self);\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":404\n * \n *     def __getitem__(memoryview self, object index):\n *         if index is Ellipsis:             # <<<<<<<<<<<<<<\n *             return self\n * \n */\n  }\n\n  /* \"View.MemoryView\":407\n *             return self\n * \n *         have_slices, indices = _unellipsify(index, self.view.ndim)             # <<<<<<<<<<<<<<\n * \n *         cdef char *itemp\n */\n  __pyx_t_3 = _unellipsify(__pyx_v_index, __pyx_v_self->view.ndim); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 407, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  if (likely(__pyx_t_3 != Py_None)) {\n    PyObject* sequence = __pyx_t_3;\n    Py_ssize_t size = __Pyx_PySequence_SIZE(sequence);\n    if (unlikely(size != 2)) {\n      if (size > 2) __Pyx_RaiseTooManyValuesError(2);\n      else if (size >= 0) __Pyx_RaiseNeedMoreValuesError(size);\n      __PYX_ERR(2, 407, __pyx_L1_error)\n    }\n    #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n    __pyx_t_4 = PyTuple_GET_ITEM(sequence, 0); \n    __pyx_t_5 = PyTuple_GET_ITEM(sequence, 1); \n    __Pyx_INCREF(__pyx_t_4);\n    __Pyx_INCREF(__pyx_t_5);\n    #else\n    __pyx_t_4 = PySequence_ITEM(sequence, 0); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 407, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_5 = PySequence_ITEM(sequence, 1); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 407, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    #endif\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  } else {\n    __Pyx_RaiseNoneNotIterableError(); __PYX_ERR(2, 407, __pyx_L1_error)\n  }\n  
__pyx_v_have_slices = __pyx_t_4;\n  __pyx_t_4 = 0;\n  __pyx_v_indices = __pyx_t_5;\n  __pyx_t_5 = 0;\n\n  /* \"View.MemoryView\":410\n * \n *         cdef char *itemp\n *         if have_slices:             # <<<<<<<<<<<<<<\n *             return memview_slice(self, indices)\n *         else:\n */\n  __pyx_t_2 = __Pyx_PyObject_IsTrue(__pyx_v_have_slices); if (unlikely(__pyx_t_2 < 0)) __PYX_ERR(2, 410, __pyx_L1_error)\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":411\n *         cdef char *itemp\n *         if have_slices:\n *             return memview_slice(self, indices)             # <<<<<<<<<<<<<<\n *         else:\n *             itemp = self.get_item_pointer(indices)\n */\n    __Pyx_XDECREF(__pyx_r);\n    __pyx_t_3 = ((PyObject *)__pyx_memview_slice(__pyx_v_self, __pyx_v_indices)); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 411, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_r = __pyx_t_3;\n    __pyx_t_3 = 0;\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":410\n * \n *         cdef char *itemp\n *         if have_slices:             # <<<<<<<<<<<<<<\n *             return memview_slice(self, indices)\n *         else:\n */\n  }\n\n  /* \"View.MemoryView\":413\n *             return memview_slice(self, indices)\n *         else:\n *             itemp = self.get_item_pointer(indices)             # <<<<<<<<<<<<<<\n *             return self.convert_item_to_object(itemp)\n * \n */\n  /*else*/ {\n    __pyx_t_6 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->get_item_pointer(__pyx_v_self, __pyx_v_indices); if (unlikely(__pyx_t_6 == ((char *)NULL))) __PYX_ERR(2, 413, __pyx_L1_error)\n    __pyx_v_itemp = __pyx_t_6;\n\n    /* \"View.MemoryView\":414\n *         else:\n *             itemp = self.get_item_pointer(indices)\n *             return self.convert_item_to_object(itemp)             # <<<<<<<<<<<<<<\n * \n *     def __setitem__(memoryview self, object index, object value):\n */\n    __Pyx_XDECREF(__pyx_r);\n    __pyx_t_3 = 
((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->convert_item_to_object(__pyx_v_self, __pyx_v_itemp); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 414, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_r = __pyx_t_3;\n    __pyx_t_3 = 0;\n    goto __pyx_L0;\n  }\n\n  /* \"View.MemoryView\":403\n * \n * \n *     def __getitem__(memoryview self, object index):             # <<<<<<<<<<<<<<\n *         if index is Ellipsis:\n *             return self\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__getitem__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_have_slices);\n  __Pyx_XDECREF(__pyx_v_indices);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":416\n *             return self.convert_item_to_object(itemp)\n * \n *     def __setitem__(memoryview self, object index, object value):             # <<<<<<<<<<<<<<\n *         if self.view.readonly:\n *             raise TypeError(\"Cannot assign to read-only memoryview\")\n */\n\n/* Python wrapper */\nstatic int __pyx_memoryview___setitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_index, PyObject *__pyx_v_value); /*proto*/\nstatic int __pyx_memoryview___setitem__(PyObject *__pyx_v_self, PyObject *__pyx_v_index, PyObject *__pyx_v_value) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__setitem__ (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_6__setitem__(((struct __pyx_memoryview_obj *)__pyx_v_self), ((PyObject *)__pyx_v_index), ((PyObject *)__pyx_v_value));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic int __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_6__setitem__(struct __pyx_memoryview_obj 
*__pyx_v_self, PyObject *__pyx_v_index, PyObject *__pyx_v_value) {\n  PyObject *__pyx_v_have_slices = NULL;\n  PyObject *__pyx_v_obj = NULL;\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__setitem__\", 0);\n  __Pyx_INCREF(__pyx_v_index);\n\n  /* \"View.MemoryView\":417\n * \n *     def __setitem__(memoryview self, object index, object value):\n *         if self.view.readonly:             # <<<<<<<<<<<<<<\n *             raise TypeError(\"Cannot assign to read-only memoryview\")\n * \n */\n  __pyx_t_1 = (__pyx_v_self->view.readonly != 0);\n  if (unlikely(__pyx_t_1)) {\n\n    /* \"View.MemoryView\":418\n *     def __setitem__(memoryview self, object index, object value):\n *         if self.view.readonly:\n *             raise TypeError(\"Cannot assign to read-only memoryview\")             # <<<<<<<<<<<<<<\n * \n *         have_slices, index = _unellipsify(index, self.view.ndim)\n */\n    __pyx_t_2 = __Pyx_PyObject_Call(__pyx_builtin_TypeError, __pyx_tuple__11, NULL); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 418, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_Raise(__pyx_t_2, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __PYX_ERR(2, 418, __pyx_L1_error)\n\n    /* \"View.MemoryView\":417\n * \n *     def __setitem__(memoryview self, object index, object value):\n *         if self.view.readonly:             # <<<<<<<<<<<<<<\n *             raise TypeError(\"Cannot assign to read-only memoryview\")\n * \n */\n  }\n\n  /* \"View.MemoryView\":420\n *             raise TypeError(\"Cannot assign to read-only memoryview\")\n * \n *         have_slices, index = _unellipsify(index, self.view.ndim)             # <<<<<<<<<<<<<<\n * \n *         if have_slices:\n */\n  __pyx_t_2 = _unellipsify(__pyx_v_index, 
__pyx_v_self->view.ndim); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 420, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  if (likely(__pyx_t_2 != Py_None)) {\n    PyObject* sequence = __pyx_t_2;\n    Py_ssize_t size = __Pyx_PySequence_SIZE(sequence);\n    if (unlikely(size != 2)) {\n      if (size > 2) __Pyx_RaiseTooManyValuesError(2);\n      else if (size >= 0) __Pyx_RaiseNeedMoreValuesError(size);\n      __PYX_ERR(2, 420, __pyx_L1_error)\n    }\n    #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n    __pyx_t_3 = PyTuple_GET_ITEM(sequence, 0); \n    __pyx_t_4 = PyTuple_GET_ITEM(sequence, 1); \n    __Pyx_INCREF(__pyx_t_3);\n    __Pyx_INCREF(__pyx_t_4);\n    #else\n    __pyx_t_3 = PySequence_ITEM(sequence, 0); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 420, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_t_4 = PySequence_ITEM(sequence, 1); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 420, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    #endif\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  } else {\n    __Pyx_RaiseNoneNotIterableError(); __PYX_ERR(2, 420, __pyx_L1_error)\n  }\n  __pyx_v_have_slices = __pyx_t_3;\n  __pyx_t_3 = 0;\n  __Pyx_DECREF_SET(__pyx_v_index, __pyx_t_4);\n  __pyx_t_4 = 0;\n\n  /* \"View.MemoryView\":422\n *         have_slices, index = _unellipsify(index, self.view.ndim)\n * \n *         if have_slices:             # <<<<<<<<<<<<<<\n *             obj = self.is_slice(value)\n *             if obj:\n */\n  __pyx_t_1 = __Pyx_PyObject_IsTrue(__pyx_v_have_slices); if (unlikely(__pyx_t_1 < 0)) __PYX_ERR(2, 422, __pyx_L1_error)\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":423\n * \n *         if have_slices:\n *             obj = self.is_slice(value)             # <<<<<<<<<<<<<<\n *             if obj:\n *                 self.setitem_slice_assignment(self[index], obj)\n */\n    __pyx_t_2 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->is_slice(__pyx_v_self, __pyx_v_value); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 
423, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_v_obj = __pyx_t_2;\n    __pyx_t_2 = 0;\n\n    /* \"View.MemoryView\":424\n *         if have_slices:\n *             obj = self.is_slice(value)\n *             if obj:             # <<<<<<<<<<<<<<\n *                 self.setitem_slice_assignment(self[index], obj)\n *             else:\n */\n    __pyx_t_1 = __Pyx_PyObject_IsTrue(__pyx_v_obj); if (unlikely(__pyx_t_1 < 0)) __PYX_ERR(2, 424, __pyx_L1_error)\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":425\n *             obj = self.is_slice(value)\n *             if obj:\n *                 self.setitem_slice_assignment(self[index], obj)             # <<<<<<<<<<<<<<\n *             else:\n *                 self.setitem_slice_assign_scalar(self[index], value)\n */\n      __pyx_t_2 = __Pyx_PyObject_GetItem(((PyObject *)__pyx_v_self), __pyx_v_index); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 425, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_2);\n      __pyx_t_4 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->setitem_slice_assignment(__pyx_v_self, __pyx_t_2, __pyx_v_obj); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 425, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_4);\n      __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n      __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n\n      /* \"View.MemoryView\":424\n *         if have_slices:\n *             obj = self.is_slice(value)\n *             if obj:             # <<<<<<<<<<<<<<\n *                 self.setitem_slice_assignment(self[index], obj)\n *             else:\n */\n      goto __pyx_L5;\n    }\n\n    /* \"View.MemoryView\":427\n *                 self.setitem_slice_assignment(self[index], obj)\n *             else:\n *                 self.setitem_slice_assign_scalar(self[index], value)             # <<<<<<<<<<<<<<\n *         else:\n *             self.setitem_indexed(index, value)\n */\n    /*else*/ {\n      __pyx_t_4 = __Pyx_PyObject_GetItem(((PyObject *)__pyx_v_self), __pyx_v_index); if 
(unlikely(!__pyx_t_4)) __PYX_ERR(2, 427, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_4);\n      if (!(likely(((__pyx_t_4) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_4, __pyx_memoryview_type))))) __PYX_ERR(2, 427, __pyx_L1_error)\n      __pyx_t_2 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->setitem_slice_assign_scalar(__pyx_v_self, ((struct __pyx_memoryview_obj *)__pyx_t_4), __pyx_v_value); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 427, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_2);\n      __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n      __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    }\n    __pyx_L5:;\n\n    /* \"View.MemoryView\":422\n *         have_slices, index = _unellipsify(index, self.view.ndim)\n * \n *         if have_slices:             # <<<<<<<<<<<<<<\n *             obj = self.is_slice(value)\n *             if obj:\n */\n    goto __pyx_L4;\n  }\n\n  /* \"View.MemoryView\":429\n *                 self.setitem_slice_assign_scalar(self[index], value)\n *         else:\n *             self.setitem_indexed(index, value)             # <<<<<<<<<<<<<<\n * \n *     cdef is_slice(self, obj):\n */\n  /*else*/ {\n    __pyx_t_2 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->setitem_indexed(__pyx_v_self, __pyx_v_index, __pyx_v_value); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 429, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  }\n  __pyx_L4:;\n\n  /* \"View.MemoryView\":416\n *             return self.convert_item_to_object(itemp)\n * \n *     def __setitem__(memoryview self, object index, object value):             # <<<<<<<<<<<<<<\n *         if self.view.readonly:\n *             raise TypeError(\"Cannot assign to read-only memoryview\")\n */\n\n  /* function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__setitem__\", 
__pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_have_slices);\n  __Pyx_XDECREF(__pyx_v_obj);\n  __Pyx_XDECREF(__pyx_v_index);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":431\n *             self.setitem_indexed(index, value)\n * \n *     cdef is_slice(self, obj):             # <<<<<<<<<<<<<<\n *         if not isinstance(obj, memoryview):\n *             try:\n */\n\nstatic PyObject *__pyx_memoryview_is_slice(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_obj) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  PyObject *__pyx_t_6 = NULL;\n  PyObject *__pyx_t_7 = NULL;\n  PyObject *__pyx_t_8 = NULL;\n  int __pyx_t_9;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"is_slice\", 0);\n  __Pyx_INCREF(__pyx_v_obj);\n\n  /* \"View.MemoryView\":432\n * \n *     cdef is_slice(self, obj):\n *         if not isinstance(obj, memoryview):             # <<<<<<<<<<<<<<\n *             try:\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,\n */\n  __pyx_t_1 = __Pyx_TypeCheck(__pyx_v_obj, __pyx_memoryview_type); \n  __pyx_t_2 = ((!(__pyx_t_1 != 0)) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":433\n *     cdef is_slice(self, obj):\n *         if not isinstance(obj, memoryview):\n *             try:             # <<<<<<<<<<<<<<\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,\n *                                  self.dtype_is_object)\n */\n    {\n      __Pyx_PyThreadState_declare\n      __Pyx_PyThreadState_assign\n      __Pyx_ExceptionSave(&__pyx_t_3, &__pyx_t_4, &__pyx_t_5);\n      __Pyx_XGOTREF(__pyx_t_3);\n      __Pyx_XGOTREF(__pyx_t_4);\n      
__Pyx_XGOTREF(__pyx_t_5);\n      /*try:*/ {\n\n        /* \"View.MemoryView\":434\n *         if not isinstance(obj, memoryview):\n *             try:\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,             # <<<<<<<<<<<<<<\n *                                  self.dtype_is_object)\n *             except TypeError:\n */\n        __pyx_t_6 = __Pyx_PyInt_From_int(((__pyx_v_self->flags & (~PyBUF_WRITABLE)) | PyBUF_ANY_CONTIGUOUS)); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 434, __pyx_L4_error)\n        __Pyx_GOTREF(__pyx_t_6);\n\n        /* \"View.MemoryView\":435\n *             try:\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,\n *                                  self.dtype_is_object)             # <<<<<<<<<<<<<<\n *             except TypeError:\n *                 return None\n */\n        __pyx_t_7 = __Pyx_PyBool_FromLong(__pyx_v_self->dtype_is_object); if (unlikely(!__pyx_t_7)) __PYX_ERR(2, 435, __pyx_L4_error)\n        __Pyx_GOTREF(__pyx_t_7);\n\n        /* \"View.MemoryView\":434\n *         if not isinstance(obj, memoryview):\n *             try:\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,             # <<<<<<<<<<<<<<\n *                                  self.dtype_is_object)\n *             except TypeError:\n */\n        __pyx_t_8 = PyTuple_New(3); if (unlikely(!__pyx_t_8)) __PYX_ERR(2, 434, __pyx_L4_error)\n        __Pyx_GOTREF(__pyx_t_8);\n        __Pyx_INCREF(__pyx_v_obj);\n        __Pyx_GIVEREF(__pyx_v_obj);\n        PyTuple_SET_ITEM(__pyx_t_8, 0, __pyx_v_obj);\n        __Pyx_GIVEREF(__pyx_t_6);\n        PyTuple_SET_ITEM(__pyx_t_8, 1, __pyx_t_6);\n        __Pyx_GIVEREF(__pyx_t_7);\n        PyTuple_SET_ITEM(__pyx_t_8, 2, __pyx_t_7);\n        __pyx_t_6 = 0;\n        __pyx_t_7 = 0;\n        __pyx_t_7 = __Pyx_PyObject_Call(((PyObject *)__pyx_memoryview_type), __pyx_t_8, NULL); if (unlikely(!__pyx_t_7)) 
__PYX_ERR(2, 434, __pyx_L4_error)\n        __Pyx_GOTREF(__pyx_t_7);\n        __Pyx_DECREF(__pyx_t_8); __pyx_t_8 = 0;\n        __Pyx_DECREF_SET(__pyx_v_obj, __pyx_t_7);\n        __pyx_t_7 = 0;\n\n        /* \"View.MemoryView\":433\n *     cdef is_slice(self, obj):\n *         if not isinstance(obj, memoryview):\n *             try:             # <<<<<<<<<<<<<<\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,\n *                                  self.dtype_is_object)\n */\n      }\n      __Pyx_XDECREF(__pyx_t_3); __pyx_t_3 = 0;\n      __Pyx_XDECREF(__pyx_t_4); __pyx_t_4 = 0;\n      __Pyx_XDECREF(__pyx_t_5); __pyx_t_5 = 0;\n      goto __pyx_L9_try_end;\n      __pyx_L4_error:;\n      __Pyx_XDECREF(__pyx_t_6); __pyx_t_6 = 0;\n      __Pyx_XDECREF(__pyx_t_7); __pyx_t_7 = 0;\n      __Pyx_XDECREF(__pyx_t_8); __pyx_t_8 = 0;\n\n      /* \"View.MemoryView\":436\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,\n *                                  self.dtype_is_object)\n *             except TypeError:             # <<<<<<<<<<<<<<\n *                 return None\n * \n */\n      __pyx_t_9 = __Pyx_PyErr_ExceptionMatches(__pyx_builtin_TypeError);\n      if (__pyx_t_9) {\n        __Pyx_AddTraceback(\"View.MemoryView.memoryview.is_slice\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n        if (__Pyx_GetException(&__pyx_t_7, &__pyx_t_8, &__pyx_t_6) < 0) __PYX_ERR(2, 436, __pyx_L6_except_error)\n        __Pyx_GOTREF(__pyx_t_7);\n        __Pyx_GOTREF(__pyx_t_8);\n        __Pyx_GOTREF(__pyx_t_6);\n\n        /* \"View.MemoryView\":437\n *                                  self.dtype_is_object)\n *             except TypeError:\n *                 return None             # <<<<<<<<<<<<<<\n * \n *         return obj\n */\n        __Pyx_XDECREF(__pyx_r);\n        __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n        __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n        __Pyx_DECREF(__pyx_t_7); __pyx_t_7 = 
0;\n        __Pyx_DECREF(__pyx_t_8); __pyx_t_8 = 0;\n        goto __pyx_L7_except_return;\n      }\n      goto __pyx_L6_except_error;\n      __pyx_L6_except_error:;\n\n      /* \"View.MemoryView\":433\n *     cdef is_slice(self, obj):\n *         if not isinstance(obj, memoryview):\n *             try:             # <<<<<<<<<<<<<<\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,\n *                                  self.dtype_is_object)\n */\n      __Pyx_XGIVEREF(__pyx_t_3);\n      __Pyx_XGIVEREF(__pyx_t_4);\n      __Pyx_XGIVEREF(__pyx_t_5);\n      __Pyx_ExceptionReset(__pyx_t_3, __pyx_t_4, __pyx_t_5);\n      goto __pyx_L1_error;\n      __pyx_L7_except_return:;\n      __Pyx_XGIVEREF(__pyx_t_3);\n      __Pyx_XGIVEREF(__pyx_t_4);\n      __Pyx_XGIVEREF(__pyx_t_5);\n      __Pyx_ExceptionReset(__pyx_t_3, __pyx_t_4, __pyx_t_5);\n      goto __pyx_L0;\n      __pyx_L9_try_end:;\n    }\n\n    /* \"View.MemoryView\":432\n * \n *     cdef is_slice(self, obj):\n *         if not isinstance(obj, memoryview):             # <<<<<<<<<<<<<<\n *             try:\n *                 obj = memoryview(obj, self.flags & ~PyBUF_WRITABLE | PyBUF_ANY_CONTIGUOUS,\n */\n  }\n\n  /* \"View.MemoryView\":439\n *                 return None\n * \n *         return obj             # <<<<<<<<<<<<<<\n * \n *     cdef setitem_slice_assignment(self, dst, src):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(__pyx_v_obj);\n  __pyx_r = __pyx_v_obj;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":431\n *             self.setitem_indexed(index, value)\n * \n *     cdef is_slice(self, obj):             # <<<<<<<<<<<<<<\n *         if not isinstance(obj, memoryview):\n *             try:\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_7);\n  __Pyx_XDECREF(__pyx_t_8);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.is_slice\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  
__pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_obj);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":441\n *         return obj\n * \n *     cdef setitem_slice_assignment(self, dst, src):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice dst_slice\n *         cdef __Pyx_memviewslice src_slice\n */\n\nstatic PyObject *__pyx_memoryview_setitem_slice_assignment(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_dst, PyObject *__pyx_v_src) {\n  __Pyx_memviewslice __pyx_v_dst_slice;\n  __Pyx_memviewslice __pyx_v_src_slice;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_memviewslice *__pyx_t_1;\n  __Pyx_memviewslice *__pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_t_4;\n  int __pyx_t_5;\n  int __pyx_t_6;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"setitem_slice_assignment\", 0);\n\n  /* \"View.MemoryView\":445\n *         cdef __Pyx_memviewslice src_slice\n * \n *         memoryview_copy_contents(get_slice_from_memview(src, &src_slice)[0],             # <<<<<<<<<<<<<<\n *                                  get_slice_from_memview(dst, &dst_slice)[0],\n *                                  src.ndim, dst.ndim, self.dtype_is_object)\n */\n  if (!(likely(((__pyx_v_src) == Py_None) || likely(__Pyx_TypeTest(__pyx_v_src, __pyx_memoryview_type))))) __PYX_ERR(2, 445, __pyx_L1_error)\n  __pyx_t_1 = __pyx_memoryview_get_slice_from_memoryview(((struct __pyx_memoryview_obj *)__pyx_v_src), (&__pyx_v_src_slice)); if (unlikely(__pyx_t_1 == ((__Pyx_memviewslice *)NULL))) __PYX_ERR(2, 445, __pyx_L1_error)\n\n  /* \"View.MemoryView\":446\n * \n *         memoryview_copy_contents(get_slice_from_memview(src, &src_slice)[0],\n *                                  get_slice_from_memview(dst, &dst_slice)[0],             # <<<<<<<<<<<<<<\n *                                  src.ndim, dst.ndim, self.dtype_is_object)\n * \n 
*/\n  if (!(likely(((__pyx_v_dst) == Py_None) || likely(__Pyx_TypeTest(__pyx_v_dst, __pyx_memoryview_type))))) __PYX_ERR(2, 446, __pyx_L1_error)\n  __pyx_t_2 = __pyx_memoryview_get_slice_from_memoryview(((struct __pyx_memoryview_obj *)__pyx_v_dst), (&__pyx_v_dst_slice)); if (unlikely(__pyx_t_2 == ((__Pyx_memviewslice *)NULL))) __PYX_ERR(2, 446, __pyx_L1_error)\n\n  /* \"View.MemoryView\":447\n *         memoryview_copy_contents(get_slice_from_memview(src, &src_slice)[0],\n *                                  get_slice_from_memview(dst, &dst_slice)[0],\n *                                  src.ndim, dst.ndim, self.dtype_is_object)             # <<<<<<<<<<<<<<\n * \n *     cdef setitem_slice_assign_scalar(self, memoryview dst, value):\n */\n  __pyx_t_3 = __Pyx_PyObject_GetAttrStr(__pyx_v_src, __pyx_n_s_ndim); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 447, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __pyx_t_4 = __Pyx_PyInt_As_int(__pyx_t_3); if (unlikely((__pyx_t_4 == (int)-1) && PyErr_Occurred())) __PYX_ERR(2, 447, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __pyx_t_3 = __Pyx_PyObject_GetAttrStr(__pyx_v_dst, __pyx_n_s_ndim); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 447, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __pyx_t_5 = __Pyx_PyInt_As_int(__pyx_t_3); if (unlikely((__pyx_t_5 == (int)-1) && PyErr_Occurred())) __PYX_ERR(2, 447, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n  /* \"View.MemoryView\":445\n *         cdef __Pyx_memviewslice src_slice\n * \n *         memoryview_copy_contents(get_slice_from_memview(src, &src_slice)[0],             # <<<<<<<<<<<<<<\n *                                  get_slice_from_memview(dst, &dst_slice)[0],\n *                                  src.ndim, dst.ndim, self.dtype_is_object)\n */\n  __pyx_t_6 = __pyx_memoryview_copy_contents((__pyx_t_1[0]), (__pyx_t_2[0]), __pyx_t_4, __pyx_t_5, __pyx_v_self->dtype_is_object); if (unlikely(__pyx_t_6 == ((int)-1))) __PYX_ERR(2, 445, __pyx_L1_error)\n\n  /* 
\"View.MemoryView\":441\n *         return obj\n * \n *     cdef setitem_slice_assignment(self, dst, src):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice dst_slice\n *         cdef __Pyx_memviewslice src_slice\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.setitem_slice_assignment\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":449\n *                                  src.ndim, dst.ndim, self.dtype_is_object)\n * \n *     cdef setitem_slice_assign_scalar(self, memoryview dst, value):             # <<<<<<<<<<<<<<\n *         cdef int array[128]\n *         cdef void *tmp = NULL\n */\n\nstatic PyObject *__pyx_memoryview_setitem_slice_assign_scalar(struct __pyx_memoryview_obj *__pyx_v_self, struct __pyx_memoryview_obj *__pyx_v_dst, PyObject *__pyx_v_value) {\n  int __pyx_v_array[0x80];\n  void *__pyx_v_tmp;\n  void *__pyx_v_item;\n  __Pyx_memviewslice *__pyx_v_dst_slice;\n  __Pyx_memviewslice __pyx_v_tmp_slice;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_memviewslice *__pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_t_4;\n  int __pyx_t_5;\n  char const *__pyx_t_6;\n  PyObject *__pyx_t_7 = NULL;\n  PyObject *__pyx_t_8 = NULL;\n  PyObject *__pyx_t_9 = NULL;\n  PyObject *__pyx_t_10 = NULL;\n  PyObject *__pyx_t_11 = NULL;\n  PyObject *__pyx_t_12 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"setitem_slice_assign_scalar\", 0);\n\n  /* \"View.MemoryView\":451\n *     cdef setitem_slice_assign_scalar(self, memoryview dst, value):\n *         cdef int array[128]\n *         cdef void *tmp = NULL             # <<<<<<<<<<<<<<\n *         cdef 
void *item\n * \n */\n  __pyx_v_tmp = NULL;\n\n  /* \"View.MemoryView\":456\n *         cdef __Pyx_memviewslice *dst_slice\n *         cdef __Pyx_memviewslice tmp_slice\n *         dst_slice = get_slice_from_memview(dst, &tmp_slice)             # <<<<<<<<<<<<<<\n * \n *         if <size_t>self.view.itemsize > sizeof(array):\n */\n  __pyx_t_1 = __pyx_memoryview_get_slice_from_memoryview(__pyx_v_dst, (&__pyx_v_tmp_slice)); if (unlikely(__pyx_t_1 == ((__Pyx_memviewslice *)NULL))) __PYX_ERR(2, 456, __pyx_L1_error)\n  __pyx_v_dst_slice = __pyx_t_1;\n\n  /* \"View.MemoryView\":458\n *         dst_slice = get_slice_from_memview(dst, &tmp_slice)\n * \n *         if <size_t>self.view.itemsize > sizeof(array):             # <<<<<<<<<<<<<<\n *             tmp = PyMem_Malloc(self.view.itemsize)\n *             if tmp == NULL:\n */\n  __pyx_t_2 = ((((size_t)__pyx_v_self->view.itemsize) > (sizeof(__pyx_v_array))) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":459\n * \n *         if <size_t>self.view.itemsize > sizeof(array):\n *             tmp = PyMem_Malloc(self.view.itemsize)             # <<<<<<<<<<<<<<\n *             if tmp == NULL:\n *                 raise MemoryError\n */\n    __pyx_v_tmp = PyMem_Malloc(__pyx_v_self->view.itemsize);\n\n    /* \"View.MemoryView\":460\n *         if <size_t>self.view.itemsize > sizeof(array):\n *             tmp = PyMem_Malloc(self.view.itemsize)\n *             if tmp == NULL:             # <<<<<<<<<<<<<<\n *                 raise MemoryError\n *             item = tmp\n */\n    __pyx_t_2 = ((__pyx_v_tmp == NULL) != 0);\n    if (unlikely(__pyx_t_2)) {\n\n      /* \"View.MemoryView\":461\n *             tmp = PyMem_Malloc(self.view.itemsize)\n *             if tmp == NULL:\n *                 raise MemoryError             # <<<<<<<<<<<<<<\n *             item = tmp\n *         else:\n */\n      PyErr_NoMemory(); __PYX_ERR(2, 461, __pyx_L1_error)\n\n      /* \"View.MemoryView\":460\n *         if <size_t>self.view.itemsize > 
sizeof(array):\n *             tmp = PyMem_Malloc(self.view.itemsize)\n *             if tmp == NULL:             # <<<<<<<<<<<<<<\n *                 raise MemoryError\n *             item = tmp\n */\n    }\n\n    /* \"View.MemoryView\":462\n *             if tmp == NULL:\n *                 raise MemoryError\n *             item = tmp             # <<<<<<<<<<<<<<\n *         else:\n *             item = <void *> array\n */\n    __pyx_v_item = __pyx_v_tmp;\n\n    /* \"View.MemoryView\":458\n *         dst_slice = get_slice_from_memview(dst, &tmp_slice)\n * \n *         if <size_t>self.view.itemsize > sizeof(array):             # <<<<<<<<<<<<<<\n *             tmp = PyMem_Malloc(self.view.itemsize)\n *             if tmp == NULL:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":464\n *             item = tmp\n *         else:\n *             item = <void *> array             # <<<<<<<<<<<<<<\n * \n *         try:\n */\n  /*else*/ {\n    __pyx_v_item = ((void *)__pyx_v_array);\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":466\n *             item = <void *> array\n * \n *         try:             # <<<<<<<<<<<<<<\n *             if self.dtype_is_object:\n *                 (<PyObject **> item)[0] = <PyObject *> value\n */\n  /*try:*/ {\n\n    /* \"View.MemoryView\":467\n * \n *         try:\n *             if self.dtype_is_object:             # <<<<<<<<<<<<<<\n *                 (<PyObject **> item)[0] = <PyObject *> value\n *             else:\n */\n    __pyx_t_2 = (__pyx_v_self->dtype_is_object != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":468\n *         try:\n *             if self.dtype_is_object:\n *                 (<PyObject **> item)[0] = <PyObject *> value             # <<<<<<<<<<<<<<\n *             else:\n *                 self.assign_item_from_object(<char *> item, value)\n */\n      (((PyObject **)__pyx_v_item)[0]) = ((PyObject *)__pyx_v_value);\n\n      /* \"View.MemoryView\":467\n * \n *         try:\n *             if 
self.dtype_is_object:             # <<<<<<<<<<<<<<\n *                 (<PyObject **> item)[0] = <PyObject *> value\n *             else:\n */\n      goto __pyx_L8;\n    }\n\n    /* \"View.MemoryView\":470\n *                 (<PyObject **> item)[0] = <PyObject *> value\n *             else:\n *                 self.assign_item_from_object(<char *> item, value)             # <<<<<<<<<<<<<<\n * \n * \n */\n    /*else*/ {\n      __pyx_t_3 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->assign_item_from_object(__pyx_v_self, ((char *)__pyx_v_item), __pyx_v_value); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 470, __pyx_L6_error)\n      __Pyx_GOTREF(__pyx_t_3);\n      __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    }\n    __pyx_L8:;\n\n    /* \"View.MemoryView\":474\n * \n * \n *             if self.view.suboffsets != NULL:             # <<<<<<<<<<<<<<\n *                 assert_direct_dimensions(self.view.suboffsets, self.view.ndim)\n *             slice_assign_scalar(dst_slice, dst.view.ndim, self.view.itemsize,\n */\n    __pyx_t_2 = ((__pyx_v_self->view.suboffsets != NULL) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":475\n * \n *             if self.view.suboffsets != NULL:\n *                 assert_direct_dimensions(self.view.suboffsets, self.view.ndim)             # <<<<<<<<<<<<<<\n *             slice_assign_scalar(dst_slice, dst.view.ndim, self.view.itemsize,\n *                                 item, self.dtype_is_object)\n */\n      __pyx_t_3 = assert_direct_dimensions(__pyx_v_self->view.suboffsets, __pyx_v_self->view.ndim); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 475, __pyx_L6_error)\n      __Pyx_GOTREF(__pyx_t_3);\n      __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n      /* \"View.MemoryView\":474\n * \n * \n *             if self.view.suboffsets != NULL:             # <<<<<<<<<<<<<<\n *                 assert_direct_dimensions(self.view.suboffsets, self.view.ndim)\n *             slice_assign_scalar(dst_slice, dst.view.ndim, 
self.view.itemsize,\n */\n    }\n\n    /* \"View.MemoryView\":476\n *             if self.view.suboffsets != NULL:\n *                 assert_direct_dimensions(self.view.suboffsets, self.view.ndim)\n *             slice_assign_scalar(dst_slice, dst.view.ndim, self.view.itemsize,             # <<<<<<<<<<<<<<\n *                                 item, self.dtype_is_object)\n *         finally:\n */\n    __pyx_memoryview_slice_assign_scalar(__pyx_v_dst_slice, __pyx_v_dst->view.ndim, __pyx_v_self->view.itemsize, __pyx_v_item, __pyx_v_self->dtype_is_object);\n  }\n\n  /* \"View.MemoryView\":479\n *                                 item, self.dtype_is_object)\n *         finally:\n *             PyMem_Free(tmp)             # <<<<<<<<<<<<<<\n * \n *     cdef setitem_indexed(self, index, value):\n */\n  /*finally:*/ {\n    /*normal exit:*/{\n      PyMem_Free(__pyx_v_tmp);\n      goto __pyx_L7;\n    }\n    __pyx_L6_error:;\n    /*exception exit:*/{\n      __Pyx_PyThreadState_declare\n      __Pyx_PyThreadState_assign\n      __pyx_t_7 = 0; __pyx_t_8 = 0; __pyx_t_9 = 0; __pyx_t_10 = 0; __pyx_t_11 = 0; __pyx_t_12 = 0;\n      __Pyx_XDECREF(__pyx_t_3); __pyx_t_3 = 0;\n      if (PY_MAJOR_VERSION >= 3) __Pyx_ExceptionSwap(&__pyx_t_10, &__pyx_t_11, &__pyx_t_12);\n      if ((PY_MAJOR_VERSION < 3) || unlikely(__Pyx_GetException(&__pyx_t_7, &__pyx_t_8, &__pyx_t_9) < 0)) __Pyx_ErrFetch(&__pyx_t_7, &__pyx_t_8, &__pyx_t_9);\n      __Pyx_XGOTREF(__pyx_t_7);\n      __Pyx_XGOTREF(__pyx_t_8);\n      __Pyx_XGOTREF(__pyx_t_9);\n      __Pyx_XGOTREF(__pyx_t_10);\n      __Pyx_XGOTREF(__pyx_t_11);\n      __Pyx_XGOTREF(__pyx_t_12);\n      __pyx_t_4 = __pyx_lineno; __pyx_t_5 = __pyx_clineno; __pyx_t_6 = __pyx_filename;\n      {\n        PyMem_Free(__pyx_v_tmp);\n      }\n      if (PY_MAJOR_VERSION >= 3) {\n        __Pyx_XGIVEREF(__pyx_t_10);\n        __Pyx_XGIVEREF(__pyx_t_11);\n        __Pyx_XGIVEREF(__pyx_t_12);\n        __Pyx_ExceptionReset(__pyx_t_10, __pyx_t_11, __pyx_t_12);\n      }\n      
__Pyx_XGIVEREF(__pyx_t_7);\n      __Pyx_XGIVEREF(__pyx_t_8);\n      __Pyx_XGIVEREF(__pyx_t_9);\n      __Pyx_ErrRestore(__pyx_t_7, __pyx_t_8, __pyx_t_9);\n      __pyx_t_7 = 0; __pyx_t_8 = 0; __pyx_t_9 = 0; __pyx_t_10 = 0; __pyx_t_11 = 0; __pyx_t_12 = 0;\n      __pyx_lineno = __pyx_t_4; __pyx_clineno = __pyx_t_5; __pyx_filename = __pyx_t_6;\n      goto __pyx_L1_error;\n    }\n    __pyx_L7:;\n  }\n\n  /* \"View.MemoryView\":449\n *                                  src.ndim, dst.ndim, self.dtype_is_object)\n * \n *     cdef setitem_slice_assign_scalar(self, memoryview dst, value):             # <<<<<<<<<<<<<<\n *         cdef int array[128]\n *         cdef void *tmp = NULL\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.setitem_slice_assign_scalar\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":481\n *             PyMem_Free(tmp)\n * \n *     cdef setitem_indexed(self, index, value):             # <<<<<<<<<<<<<<\n *         cdef char *itemp = self.get_item_pointer(index)\n *         self.assign_item_from_object(itemp, value)\n */\n\nstatic PyObject *__pyx_memoryview_setitem_indexed(struct __pyx_memoryview_obj *__pyx_v_self, PyObject *__pyx_v_index, PyObject *__pyx_v_value) {\n  char *__pyx_v_itemp;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  char *__pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"setitem_indexed\", 0);\n\n  /* \"View.MemoryView\":482\n * \n *     cdef setitem_indexed(self, index, value):\n *         cdef char *itemp = self.get_item_pointer(index)             # <<<<<<<<<<<<<<\n *         self.assign_item_from_object(itemp, value)\n * \n 
*/\n  __pyx_t_1 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->get_item_pointer(__pyx_v_self, __pyx_v_index); if (unlikely(__pyx_t_1 == ((char *)NULL))) __PYX_ERR(2, 482, __pyx_L1_error)\n  __pyx_v_itemp = __pyx_t_1;\n\n  /* \"View.MemoryView\":483\n *     cdef setitem_indexed(self, index, value):\n *         cdef char *itemp = self.get_item_pointer(index)\n *         self.assign_item_from_object(itemp, value)             # <<<<<<<<<<<<<<\n * \n *     cdef convert_item_to_object(self, char *itemp):\n */\n  __pyx_t_2 = ((struct __pyx_vtabstruct_memoryview *)__pyx_v_self->__pyx_vtab)->assign_item_from_object(__pyx_v_self, __pyx_v_itemp, __pyx_v_value); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 483, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n\n  /* \"View.MemoryView\":481\n *             PyMem_Free(tmp)\n * \n *     cdef setitem_indexed(self, index, value):             # <<<<<<<<<<<<<<\n *         cdef char *itemp = self.get_item_pointer(index)\n *         self.assign_item_from_object(itemp, value)\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.setitem_indexed\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":485\n *         self.assign_item_from_object(itemp, value)\n * \n *     cdef convert_item_to_object(self, char *itemp):             # <<<<<<<<<<<<<<\n *         \"\"\"Only used if instantiated manually by the user, or if Cython doesn't\n *         know how to convert the type\"\"\"\n */\n\nstatic PyObject *__pyx_memoryview_convert_item_to_object(struct __pyx_memoryview_obj *__pyx_v_self, char *__pyx_v_itemp) {\n  PyObject *__pyx_v_struct = NULL;\n  PyObject *__pyx_v_bytesitem = 0;\n  PyObject *__pyx_v_result = 
NULL;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  PyObject *__pyx_t_6 = NULL;\n  PyObject *__pyx_t_7 = NULL;\n  int __pyx_t_8;\n  PyObject *__pyx_t_9 = NULL;\n  size_t __pyx_t_10;\n  int __pyx_t_11;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"convert_item_to_object\", 0);\n\n  /* \"View.MemoryView\":488\n *         \"\"\"Only used if instantiated manually by the user, or if Cython doesn't\n *         know how to convert the type\"\"\"\n *         import struct             # <<<<<<<<<<<<<<\n *         cdef bytes bytesitem\n * \n */\n  __pyx_t_1 = __Pyx_Import(__pyx_n_s_struct, 0, 0); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 488, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_v_struct = __pyx_t_1;\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":491\n *         cdef bytes bytesitem\n * \n *         bytesitem = itemp[:self.view.itemsize]             # <<<<<<<<<<<<<<\n *         try:\n *             result = struct.unpack(self.view.format, bytesitem)\n */\n  __pyx_t_1 = __Pyx_PyBytes_FromStringAndSize(__pyx_v_itemp + 0, __pyx_v_self->view.itemsize - 0); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 491, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_v_bytesitem = ((PyObject*)__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":492\n * \n *         bytesitem = itemp[:self.view.itemsize]\n *         try:             # <<<<<<<<<<<<<<\n *             result = struct.unpack(self.view.format, bytesitem)\n *         except struct.error:\n */\n  {\n    __Pyx_PyThreadState_declare\n    __Pyx_PyThreadState_assign\n    __Pyx_ExceptionSave(&__pyx_t_2, &__pyx_t_3, &__pyx_t_4);\n    __Pyx_XGOTREF(__pyx_t_2);\n    __Pyx_XGOTREF(__pyx_t_3);\n    __Pyx_XGOTREF(__pyx_t_4);\n    /*try:*/ {\n\n      /* \"View.MemoryView\":493\n *         
bytesitem = itemp[:self.view.itemsize]\n *         try:\n *             result = struct.unpack(self.view.format, bytesitem)             # <<<<<<<<<<<<<<\n *         except struct.error:\n *             raise ValueError(\"Unable to convert item to object\")\n */\n      __pyx_t_5 = __Pyx_PyObject_GetAttrStr(__pyx_v_struct, __pyx_n_s_unpack); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 493, __pyx_L3_error)\n      __Pyx_GOTREF(__pyx_t_5);\n      __pyx_t_6 = __Pyx_PyBytes_FromString(__pyx_v_self->view.format); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 493, __pyx_L3_error)\n      __Pyx_GOTREF(__pyx_t_6);\n      __pyx_t_7 = NULL;\n      __pyx_t_8 = 0;\n      if (CYTHON_UNPACK_METHODS && likely(PyMethod_Check(__pyx_t_5))) {\n        __pyx_t_7 = PyMethod_GET_SELF(__pyx_t_5);\n        if (likely(__pyx_t_7)) {\n          PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_5);\n          __Pyx_INCREF(__pyx_t_7);\n          __Pyx_INCREF(function);\n          __Pyx_DECREF_SET(__pyx_t_5, function);\n          __pyx_t_8 = 1;\n        }\n      }\n      #if CYTHON_FAST_PYCALL\n      if (PyFunction_Check(__pyx_t_5)) {\n        PyObject *__pyx_temp[3] = {__pyx_t_7, __pyx_t_6, __pyx_v_bytesitem};\n        __pyx_t_1 = __Pyx_PyFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 493, __pyx_L3_error)\n        __Pyx_XDECREF(__pyx_t_7); __pyx_t_7 = 0;\n        __Pyx_GOTREF(__pyx_t_1);\n        __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n      } else\n      #endif\n      #if CYTHON_FAST_PYCCALL\n      if (__Pyx_PyFastCFunction_Check(__pyx_t_5)) {\n        PyObject *__pyx_temp[3] = {__pyx_t_7, __pyx_t_6, __pyx_v_bytesitem};\n        __pyx_t_1 = __Pyx_PyCFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 493, __pyx_L3_error)\n        __Pyx_XDECREF(__pyx_t_7); __pyx_t_7 = 0;\n        __Pyx_GOTREF(__pyx_t_1);\n        __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n      } else\n      #endif\n      {\n       
 __pyx_t_9 = PyTuple_New(2+__pyx_t_8); if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 493, __pyx_L3_error)\n        __Pyx_GOTREF(__pyx_t_9);\n        if (__pyx_t_7) {\n          __Pyx_GIVEREF(__pyx_t_7); PyTuple_SET_ITEM(__pyx_t_9, 0, __pyx_t_7); __pyx_t_7 = NULL;\n        }\n        __Pyx_GIVEREF(__pyx_t_6);\n        PyTuple_SET_ITEM(__pyx_t_9, 0+__pyx_t_8, __pyx_t_6);\n        __Pyx_INCREF(__pyx_v_bytesitem);\n        __Pyx_GIVEREF(__pyx_v_bytesitem);\n        PyTuple_SET_ITEM(__pyx_t_9, 1+__pyx_t_8, __pyx_v_bytesitem);\n        __pyx_t_6 = 0;\n        __pyx_t_1 = __Pyx_PyObject_Call(__pyx_t_5, __pyx_t_9, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 493, __pyx_L3_error)\n        __Pyx_GOTREF(__pyx_t_1);\n        __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n      }\n      __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;\n      __pyx_v_result = __pyx_t_1;\n      __pyx_t_1 = 0;\n\n      /* \"View.MemoryView\":492\n * \n *         bytesitem = itemp[:self.view.itemsize]\n *         try:             # <<<<<<<<<<<<<<\n *             result = struct.unpack(self.view.format, bytesitem)\n *         except struct.error:\n */\n    }\n\n    /* \"View.MemoryView\":497\n *             raise ValueError(\"Unable to convert item to object\")\n *         else:\n *             if len(self.view.format) == 1:             # <<<<<<<<<<<<<<\n *                 return result[0]\n *             return result\n */\n    /*else:*/ {\n      __pyx_t_10 = strlen(__pyx_v_self->view.format); \n      __pyx_t_11 = ((__pyx_t_10 == 1) != 0);\n      if (__pyx_t_11) {\n\n        /* \"View.MemoryView\":498\n *         else:\n *             if len(self.view.format) == 1:\n *                 return result[0]             # <<<<<<<<<<<<<<\n *             return result\n * \n */\n        __Pyx_XDECREF(__pyx_r);\n        __pyx_t_1 = __Pyx_GetItemInt(__pyx_v_result, 0, long, 1, __Pyx_PyInt_From_long, 0, 0, 1); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 498, __pyx_L5_except_error)\n        __Pyx_GOTREF(__pyx_t_1);\n        __pyx_r = 
__pyx_t_1;\n        __pyx_t_1 = 0;\n        goto __pyx_L6_except_return;\n\n        /* \"View.MemoryView\":497\n *             raise ValueError(\"Unable to convert item to object\")\n *         else:\n *             if len(self.view.format) == 1:             # <<<<<<<<<<<<<<\n *                 return result[0]\n *             return result\n */\n      }\n\n      /* \"View.MemoryView\":499\n *             if len(self.view.format) == 1:\n *                 return result[0]\n *             return result             # <<<<<<<<<<<<<<\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):\n */\n      __Pyx_XDECREF(__pyx_r);\n      __Pyx_INCREF(__pyx_v_result);\n      __pyx_r = __pyx_v_result;\n      goto __pyx_L6_except_return;\n    }\n    __pyx_L3_error:;\n    __Pyx_XDECREF(__pyx_t_1); __pyx_t_1 = 0;\n    __Pyx_XDECREF(__pyx_t_5); __pyx_t_5 = 0;\n    __Pyx_XDECREF(__pyx_t_6); __pyx_t_6 = 0;\n    __Pyx_XDECREF(__pyx_t_7); __pyx_t_7 = 0;\n    __Pyx_XDECREF(__pyx_t_9); __pyx_t_9 = 0;\n\n    /* \"View.MemoryView\":494\n *         try:\n *             result = struct.unpack(self.view.format, bytesitem)\n *         except struct.error:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Unable to convert item to object\")\n *         else:\n */\n    __Pyx_ErrFetch(&__pyx_t_1, &__pyx_t_5, &__pyx_t_9);\n    __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_v_struct, __pyx_n_s_error); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 494, __pyx_L5_except_error)\n    __Pyx_GOTREF(__pyx_t_6);\n    __pyx_t_8 = __Pyx_PyErr_GivenExceptionMatches(__pyx_t_1, __pyx_t_6);\n    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n    __Pyx_ErrRestore(__pyx_t_1, __pyx_t_5, __pyx_t_9);\n    __pyx_t_1 = 0; __pyx_t_5 = 0; __pyx_t_9 = 0;\n    if (__pyx_t_8) {\n      __Pyx_AddTraceback(\"View.MemoryView.memoryview.convert_item_to_object\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n      if (__Pyx_GetException(&__pyx_t_9, &__pyx_t_5, &__pyx_t_1) < 0) __PYX_ERR(2, 494, 
__pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_9);\n      __Pyx_GOTREF(__pyx_t_5);\n      __Pyx_GOTREF(__pyx_t_1);\n\n      /* \"View.MemoryView\":495\n *             result = struct.unpack(self.view.format, bytesitem)\n *         except struct.error:\n *             raise ValueError(\"Unable to convert item to object\")             # <<<<<<<<<<<<<<\n *         else:\n *             if len(self.view.format) == 1:\n */\n      __pyx_t_6 = __Pyx_PyObject_Call(__pyx_builtin_ValueError, __pyx_tuple__12, NULL); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 495, __pyx_L5_except_error)\n      __Pyx_GOTREF(__pyx_t_6);\n      __Pyx_Raise(__pyx_t_6, 0, 0, 0);\n      __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n      __PYX_ERR(2, 495, __pyx_L5_except_error)\n    }\n    goto __pyx_L5_except_error;\n    __pyx_L5_except_error:;\n\n    /* \"View.MemoryView\":492\n * \n *         bytesitem = itemp[:self.view.itemsize]\n *         try:             # <<<<<<<<<<<<<<\n *             result = struct.unpack(self.view.format, bytesitem)\n *         except struct.error:\n */\n    __Pyx_XGIVEREF(__pyx_t_2);\n    __Pyx_XGIVEREF(__pyx_t_3);\n    __Pyx_XGIVEREF(__pyx_t_4);\n    __Pyx_ExceptionReset(__pyx_t_2, __pyx_t_3, __pyx_t_4);\n    goto __pyx_L1_error;\n    __pyx_L6_except_return:;\n    __Pyx_XGIVEREF(__pyx_t_2);\n    __Pyx_XGIVEREF(__pyx_t_3);\n    __Pyx_XGIVEREF(__pyx_t_4);\n    __Pyx_ExceptionReset(__pyx_t_2, __pyx_t_3, __pyx_t_4);\n    goto __pyx_L0;\n  }\n\n  /* \"View.MemoryView\":485\n *         self.assign_item_from_object(itemp, value)\n * \n *     cdef convert_item_to_object(self, char *itemp):             # <<<<<<<<<<<<<<\n *         \"\"\"Only used if instantiated manually by the user, or if Cython doesn't\n *         know how to convert the type\"\"\"\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_7);\n  __Pyx_XDECREF(__pyx_t_9);\n  
__Pyx_AddTraceback(\"View.MemoryView.memoryview.convert_item_to_object\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_struct);\n  __Pyx_XDECREF(__pyx_v_bytesitem);\n  __Pyx_XDECREF(__pyx_v_result);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":501\n *             return result\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):             # <<<<<<<<<<<<<<\n *         \"\"\"Only used if instantiated manually by the user, or if Cython doesn't\n *         know how to convert the type\"\"\"\n */\n\nstatic PyObject *__pyx_memoryview_assign_item_from_object(struct __pyx_memoryview_obj *__pyx_v_self, char *__pyx_v_itemp, PyObject *__pyx_v_value) {\n  PyObject *__pyx_v_struct = NULL;\n  char __pyx_v_c;\n  PyObject *__pyx_v_bytesvalue = 0;\n  Py_ssize_t __pyx_v_i;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  PyObject *__pyx_t_6 = NULL;\n  int __pyx_t_7;\n  PyObject *__pyx_t_8 = NULL;\n  Py_ssize_t __pyx_t_9;\n  PyObject *__pyx_t_10 = NULL;\n  char *__pyx_t_11;\n  char *__pyx_t_12;\n  char *__pyx_t_13;\n  char *__pyx_t_14;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"assign_item_from_object\", 0);\n\n  /* \"View.MemoryView\":504\n *         \"\"\"Only used if instantiated manually by the user, or if Cython doesn't\n *         know how to convert the type\"\"\"\n *         import struct             # <<<<<<<<<<<<<<\n *         cdef char c\n *         cdef bytes bytesvalue\n */\n  __pyx_t_1 = __Pyx_Import(__pyx_n_s_struct, 0, 0); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 504, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_v_struct = __pyx_t_1;\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":509\n *         cdef 
Py_ssize_t i\n * \n *         if isinstance(value, tuple):             # <<<<<<<<<<<<<<\n *             bytesvalue = struct.pack(self.view.format, *value)\n *         else:\n */\n  __pyx_t_2 = PyTuple_Check(__pyx_v_value); \n  __pyx_t_3 = (__pyx_t_2 != 0);\n  if (__pyx_t_3) {\n\n    /* \"View.MemoryView\":510\n * \n *         if isinstance(value, tuple):\n *             bytesvalue = struct.pack(self.view.format, *value)             # <<<<<<<<<<<<<<\n *         else:\n *             bytesvalue = struct.pack(self.view.format, value)\n */\n    __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_v_struct, __pyx_n_s_pack); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 510, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_1);\n    __pyx_t_4 = __Pyx_PyBytes_FromString(__pyx_v_self->view.format); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 510, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_5 = PyTuple_New(1); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 510, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    __Pyx_GIVEREF(__pyx_t_4);\n    PyTuple_SET_ITEM(__pyx_t_5, 0, __pyx_t_4);\n    __pyx_t_4 = 0;\n    __pyx_t_4 = __Pyx_PySequence_Tuple(__pyx_v_value); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 510, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_6 = PyNumber_Add(__pyx_t_5, __pyx_t_4); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 510, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_6);\n    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;\n    __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n    __pyx_t_4 = __Pyx_PyObject_Call(__pyx_t_1, __pyx_t_6, NULL); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 510, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n    if (!(likely(PyBytes_CheckExact(__pyx_t_4))||((__pyx_t_4) == Py_None)||(PyErr_Format(PyExc_TypeError, \"Expected %.16s, got %.200s\", \"bytes\", Py_TYPE(__pyx_t_4)->tp_name), 0))) __PYX_ERR(2, 510, __pyx_L1_error)\n    __pyx_v_bytesvalue = ((PyObject*)__pyx_t_4);\n    __pyx_t_4 
= 0;\n\n    /* \"View.MemoryView\":509\n *         cdef Py_ssize_t i\n * \n *         if isinstance(value, tuple):             # <<<<<<<<<<<<<<\n *             bytesvalue = struct.pack(self.view.format, *value)\n *         else:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":512\n *             bytesvalue = struct.pack(self.view.format, *value)\n *         else:\n *             bytesvalue = struct.pack(self.view.format, value)             # <<<<<<<<<<<<<<\n * \n *         for i, c in enumerate(bytesvalue):\n */\n  /*else*/ {\n    __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_v_struct, __pyx_n_s_pack); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 512, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_6);\n    __pyx_t_1 = __Pyx_PyBytes_FromString(__pyx_v_self->view.format); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 512, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_1);\n    __pyx_t_5 = NULL;\n    __pyx_t_7 = 0;\n    if (CYTHON_UNPACK_METHODS && likely(PyMethod_Check(__pyx_t_6))) {\n      __pyx_t_5 = PyMethod_GET_SELF(__pyx_t_6);\n      if (likely(__pyx_t_5)) {\n        PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_6);\n        __Pyx_INCREF(__pyx_t_5);\n        __Pyx_INCREF(function);\n        __Pyx_DECREF_SET(__pyx_t_6, function);\n        __pyx_t_7 = 1;\n      }\n    }\n    #if CYTHON_FAST_PYCALL\n    if (PyFunction_Check(__pyx_t_6)) {\n      PyObject *__pyx_temp[3] = {__pyx_t_5, __pyx_t_1, __pyx_v_value};\n      __pyx_t_4 = __Pyx_PyFunction_FastCall(__pyx_t_6, __pyx_temp+1-__pyx_t_7, 2+__pyx_t_7); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 512, __pyx_L1_error)\n      __Pyx_XDECREF(__pyx_t_5); __pyx_t_5 = 0;\n      __Pyx_GOTREF(__pyx_t_4);\n      __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n    } else\n    #endif\n    #if CYTHON_FAST_PYCCALL\n    if (__Pyx_PyFastCFunction_Check(__pyx_t_6)) {\n      PyObject *__pyx_temp[3] = {__pyx_t_5, __pyx_t_1, __pyx_v_value};\n      __pyx_t_4 = __Pyx_PyCFunction_FastCall(__pyx_t_6, __pyx_temp+1-__pyx_t_7, 2+__pyx_t_7); if (unlikely(!__pyx_t_4)) 
__PYX_ERR(2, 512, __pyx_L1_error)\n      __Pyx_XDECREF(__pyx_t_5); __pyx_t_5 = 0;\n      __Pyx_GOTREF(__pyx_t_4);\n      __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n    } else\n    #endif\n    {\n      __pyx_t_8 = PyTuple_New(2+__pyx_t_7); if (unlikely(!__pyx_t_8)) __PYX_ERR(2, 512, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_8);\n      if (__pyx_t_5) {\n        __Pyx_GIVEREF(__pyx_t_5); PyTuple_SET_ITEM(__pyx_t_8, 0, __pyx_t_5); __pyx_t_5 = NULL;\n      }\n      __Pyx_GIVEREF(__pyx_t_1);\n      PyTuple_SET_ITEM(__pyx_t_8, 0+__pyx_t_7, __pyx_t_1);\n      __Pyx_INCREF(__pyx_v_value);\n      __Pyx_GIVEREF(__pyx_v_value);\n      PyTuple_SET_ITEM(__pyx_t_8, 1+__pyx_t_7, __pyx_v_value);\n      __pyx_t_1 = 0;\n      __pyx_t_4 = __Pyx_PyObject_Call(__pyx_t_6, __pyx_t_8, NULL); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 512, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_4);\n      __Pyx_DECREF(__pyx_t_8); __pyx_t_8 = 0;\n    }\n    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n    if (!(likely(PyBytes_CheckExact(__pyx_t_4))||((__pyx_t_4) == Py_None)||(PyErr_Format(PyExc_TypeError, \"Expected %.16s, got %.200s\", \"bytes\", Py_TYPE(__pyx_t_4)->tp_name), 0))) __PYX_ERR(2, 512, __pyx_L1_error)\n    __pyx_v_bytesvalue = ((PyObject*)__pyx_t_4);\n    __pyx_t_4 = 0;\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":514\n *             bytesvalue = struct.pack(self.view.format, value)\n * \n *         for i, c in enumerate(bytesvalue):             # <<<<<<<<<<<<<<\n *             itemp[i] = c\n * \n */\n  __pyx_t_9 = 0;\n  if (unlikely(__pyx_v_bytesvalue == Py_None)) {\n    PyErr_SetString(PyExc_TypeError, \"'NoneType' is not iterable\");\n    __PYX_ERR(2, 514, __pyx_L1_error)\n  }\n  __Pyx_INCREF(__pyx_v_bytesvalue);\n  __pyx_t_10 = __pyx_v_bytesvalue;\n  __pyx_t_12 = PyBytes_AS_STRING(__pyx_t_10);\n  __pyx_t_13 = (__pyx_t_12 + PyBytes_GET_SIZE(__pyx_t_10));\n  for (__pyx_t_14 = __pyx_t_12; __pyx_t_14 < __pyx_t_13; __pyx_t_14++) {\n    __pyx_t_11 = __pyx_t_14;\n    __pyx_v_c = 
(__pyx_t_11[0]);\n\n    /* \"View.MemoryView\":515\n * \n *         for i, c in enumerate(bytesvalue):\n *             itemp[i] = c             # <<<<<<<<<<<<<<\n * \n *     @cname('getbuffer')\n */\n    __pyx_v_i = __pyx_t_9;\n\n    /* \"View.MemoryView\":514\n *             bytesvalue = struct.pack(self.view.format, value)\n * \n *         for i, c in enumerate(bytesvalue):             # <<<<<<<<<<<<<<\n *             itemp[i] = c\n * \n */\n    __pyx_t_9 = (__pyx_t_9 + 1);\n\n    /* \"View.MemoryView\":515\n * \n *         for i, c in enumerate(bytesvalue):\n *             itemp[i] = c             # <<<<<<<<<<<<<<\n * \n *     @cname('getbuffer')\n */\n    (__pyx_v_itemp[__pyx_v_i]) = __pyx_v_c;\n  }\n  __Pyx_DECREF(__pyx_t_10); __pyx_t_10 = 0;\n\n  /* \"View.MemoryView\":501\n *             return result\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):             # <<<<<<<<<<<<<<\n *         \"\"\"Only used if instantiated manually by the user, or if Cython doesn't\n *         know how to convert the type\"\"\"\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_8);\n  __Pyx_XDECREF(__pyx_t_10);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.assign_item_from_object\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_struct);\n  __Pyx_XDECREF(__pyx_v_bytesvalue);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":518\n * \n *     @cname('getbuffer')\n *     def __getbuffer__(self, Py_buffer *info, int flags):             # <<<<<<<<<<<<<<\n *         if flags & PyBUF_WRITABLE and self.view.readonly:\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")\n */\n\n/* Python 
wrapper */\nstatic CYTHON_UNUSED int __pyx_memoryview_getbuffer(PyObject *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags); /*proto*/\nstatic CYTHON_UNUSED int __pyx_memoryview_getbuffer(PyObject *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__getbuffer__ (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_8__getbuffer__(((struct __pyx_memoryview_obj *)__pyx_v_self), ((Py_buffer *)__pyx_v_info), ((int)__pyx_v_flags));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic int __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_8__getbuffer__(struct __pyx_memoryview_obj *__pyx_v_self, Py_buffer *__pyx_v_info, int __pyx_v_flags) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  Py_ssize_t *__pyx_t_4;\n  char *__pyx_t_5;\n  void *__pyx_t_6;\n  int __pyx_t_7;\n  Py_ssize_t __pyx_t_8;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  if (__pyx_v_info == NULL) {\n    PyErr_SetString(PyExc_BufferError, \"PyObject_GetBuffer: view==NULL argument is obsolete\");\n    return -1;\n  }\n  __Pyx_RefNannySetupContext(\"__getbuffer__\", 0);\n  __pyx_v_info->obj = Py_None; __Pyx_INCREF(Py_None);\n  __Pyx_GIVEREF(__pyx_v_info->obj);\n\n  /* \"View.MemoryView\":519\n *     @cname('getbuffer')\n *     def __getbuffer__(self, Py_buffer *info, int flags):\n *         if flags & PyBUF_WRITABLE and self.view.readonly:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")\n * \n */\n  __pyx_t_2 = ((__pyx_v_flags & PyBUF_WRITABLE) != 0);\n  if (__pyx_t_2) {\n  } else {\n    __pyx_t_1 = __pyx_t_2;\n    goto __pyx_L4_bool_binop_done;\n  }\n  __pyx_t_2 = (__pyx_v_self->view.readonly != 0);\n  __pyx_t_1 = __pyx_t_2;\n  
__pyx_L4_bool_binop_done:;\n  if (unlikely(__pyx_t_1)) {\n\n    /* \"View.MemoryView\":520\n *     def __getbuffer__(self, Py_buffer *info, int flags):\n *         if flags & PyBUF_WRITABLE and self.view.readonly:\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")             # <<<<<<<<<<<<<<\n * \n *         if flags & PyBUF_ND:\n */\n    __pyx_t_3 = __Pyx_PyObject_Call(__pyx_builtin_ValueError, __pyx_tuple__13, NULL); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 520, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __PYX_ERR(2, 520, __pyx_L1_error)\n\n    /* \"View.MemoryView\":519\n *     @cname('getbuffer')\n *     def __getbuffer__(self, Py_buffer *info, int flags):\n *         if flags & PyBUF_WRITABLE and self.view.readonly:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")\n * \n */\n  }\n\n  /* \"View.MemoryView\":522\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")\n * \n *         if flags & PyBUF_ND:             # <<<<<<<<<<<<<<\n *             info.shape = self.view.shape\n *         else:\n */\n  __pyx_t_1 = ((__pyx_v_flags & PyBUF_ND) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":523\n * \n *         if flags & PyBUF_ND:\n *             info.shape = self.view.shape             # <<<<<<<<<<<<<<\n *         else:\n *             info.shape = NULL\n */\n    __pyx_t_4 = __pyx_v_self->view.shape;\n    __pyx_v_info->shape = __pyx_t_4;\n\n    /* \"View.MemoryView\":522\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")\n * \n *         if flags & PyBUF_ND:             # <<<<<<<<<<<<<<\n *             info.shape = self.view.shape\n *         else:\n */\n    goto __pyx_L6;\n  }\n\n  /* \"View.MemoryView\":525\n *             info.shape = 
self.view.shape\n *         else:\n *             info.shape = NULL             # <<<<<<<<<<<<<<\n * \n *         if flags & PyBUF_STRIDES:\n */\n  /*else*/ {\n    __pyx_v_info->shape = NULL;\n  }\n  __pyx_L6:;\n\n  /* \"View.MemoryView\":527\n *             info.shape = NULL\n * \n *         if flags & PyBUF_STRIDES:             # <<<<<<<<<<<<<<\n *             info.strides = self.view.strides\n *         else:\n */\n  __pyx_t_1 = ((__pyx_v_flags & PyBUF_STRIDES) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":528\n * \n *         if flags & PyBUF_STRIDES:\n *             info.strides = self.view.strides             # <<<<<<<<<<<<<<\n *         else:\n *             info.strides = NULL\n */\n    __pyx_t_4 = __pyx_v_self->view.strides;\n    __pyx_v_info->strides = __pyx_t_4;\n\n    /* \"View.MemoryView\":527\n *             info.shape = NULL\n * \n *         if flags & PyBUF_STRIDES:             # <<<<<<<<<<<<<<\n *             info.strides = self.view.strides\n *         else:\n */\n    goto __pyx_L7;\n  }\n\n  /* \"View.MemoryView\":530\n *             info.strides = self.view.strides\n *         else:\n *             info.strides = NULL             # <<<<<<<<<<<<<<\n * \n *         if flags & PyBUF_INDIRECT:\n */\n  /*else*/ {\n    __pyx_v_info->strides = NULL;\n  }\n  __pyx_L7:;\n\n  /* \"View.MemoryView\":532\n *             info.strides = NULL\n * \n *         if flags & PyBUF_INDIRECT:             # <<<<<<<<<<<<<<\n *             info.suboffsets = self.view.suboffsets\n *         else:\n */\n  __pyx_t_1 = ((__pyx_v_flags & PyBUF_INDIRECT) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":533\n * \n *         if flags & PyBUF_INDIRECT:\n *             info.suboffsets = self.view.suboffsets             # <<<<<<<<<<<<<<\n *         else:\n *             info.suboffsets = NULL\n */\n    __pyx_t_4 = __pyx_v_self->view.suboffsets;\n    __pyx_v_info->suboffsets = __pyx_t_4;\n\n    /* \"View.MemoryView\":532\n *             info.strides = NULL\n * \n 
*         if flags & PyBUF_INDIRECT:             # <<<<<<<<<<<<<<\n *             info.suboffsets = self.view.suboffsets\n *         else:\n */\n    goto __pyx_L8;\n  }\n\n  /* \"View.MemoryView\":535\n *             info.suboffsets = self.view.suboffsets\n *         else:\n *             info.suboffsets = NULL             # <<<<<<<<<<<<<<\n * \n *         if flags & PyBUF_FORMAT:\n */\n  /*else*/ {\n    __pyx_v_info->suboffsets = NULL;\n  }\n  __pyx_L8:;\n\n  /* \"View.MemoryView\":537\n *             info.suboffsets = NULL\n * \n *         if flags & PyBUF_FORMAT:             # <<<<<<<<<<<<<<\n *             info.format = self.view.format\n *         else:\n */\n  __pyx_t_1 = ((__pyx_v_flags & PyBUF_FORMAT) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":538\n * \n *         if flags & PyBUF_FORMAT:\n *             info.format = self.view.format             # <<<<<<<<<<<<<<\n *         else:\n *             info.format = NULL\n */\n    __pyx_t_5 = __pyx_v_self->view.format;\n    __pyx_v_info->format = __pyx_t_5;\n\n    /* \"View.MemoryView\":537\n *             info.suboffsets = NULL\n * \n *         if flags & PyBUF_FORMAT:             # <<<<<<<<<<<<<<\n *             info.format = self.view.format\n *         else:\n */\n    goto __pyx_L9;\n  }\n\n  /* \"View.MemoryView\":540\n *             info.format = self.view.format\n *         else:\n *             info.format = NULL             # <<<<<<<<<<<<<<\n * \n *         info.buf = self.view.buf\n */\n  /*else*/ {\n    __pyx_v_info->format = NULL;\n  }\n  __pyx_L9:;\n\n  /* \"View.MemoryView\":542\n *             info.format = NULL\n * \n *         info.buf = self.view.buf             # <<<<<<<<<<<<<<\n *         info.ndim = self.view.ndim\n *         info.itemsize = self.view.itemsize\n */\n  __pyx_t_6 = __pyx_v_self->view.buf;\n  __pyx_v_info->buf = __pyx_t_6;\n\n  /* \"View.MemoryView\":543\n * \n *         info.buf = self.view.buf\n *         info.ndim = self.view.ndim             # <<<<<<<<<<<<<<\n 
*         info.itemsize = self.view.itemsize\n *         info.len = self.view.len\n */\n  __pyx_t_7 = __pyx_v_self->view.ndim;\n  __pyx_v_info->ndim = __pyx_t_7;\n\n  /* \"View.MemoryView\":544\n *         info.buf = self.view.buf\n *         info.ndim = self.view.ndim\n *         info.itemsize = self.view.itemsize             # <<<<<<<<<<<<<<\n *         info.len = self.view.len\n *         info.readonly = self.view.readonly\n */\n  __pyx_t_8 = __pyx_v_self->view.itemsize;\n  __pyx_v_info->itemsize = __pyx_t_8;\n\n  /* \"View.MemoryView\":545\n *         info.ndim = self.view.ndim\n *         info.itemsize = self.view.itemsize\n *         info.len = self.view.len             # <<<<<<<<<<<<<<\n *         info.readonly = self.view.readonly\n *         info.obj = self\n */\n  __pyx_t_8 = __pyx_v_self->view.len;\n  __pyx_v_info->len = __pyx_t_8;\n\n  /* \"View.MemoryView\":546\n *         info.itemsize = self.view.itemsize\n *         info.len = self.view.len\n *         info.readonly = self.view.readonly             # <<<<<<<<<<<<<<\n *         info.obj = self\n * \n */\n  __pyx_t_1 = __pyx_v_self->view.readonly;\n  __pyx_v_info->readonly = __pyx_t_1;\n\n  /* \"View.MemoryView\":547\n *         info.len = self.view.len\n *         info.readonly = self.view.readonly\n *         info.obj = self             # <<<<<<<<<<<<<<\n * \n *     __pyx_getbuffer = capsule(<void *> &__pyx_memoryview_getbuffer, \"getbuffer(obj, view, flags)\")\n */\n  __Pyx_INCREF(((PyObject *)__pyx_v_self));\n  __Pyx_GIVEREF(((PyObject *)__pyx_v_self));\n  __Pyx_GOTREF(__pyx_v_info->obj);\n  __Pyx_DECREF(__pyx_v_info->obj);\n  __pyx_v_info->obj = ((PyObject *)__pyx_v_self);\n\n  /* \"View.MemoryView\":518\n * \n *     @cname('getbuffer')\n *     def __getbuffer__(self, Py_buffer *info, int flags):             # <<<<<<<<<<<<<<\n *         if flags & PyBUF_WRITABLE and self.view.readonly:\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")\n */\n\n  /* 
function exit code */\n  __pyx_r = 0;\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__getbuffer__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  if (__pyx_v_info->obj != NULL) {\n    __Pyx_GOTREF(__pyx_v_info->obj);\n    __Pyx_DECREF(__pyx_v_info->obj); __pyx_v_info->obj = 0;\n  }\n  goto __pyx_L2;\n  __pyx_L0:;\n  if (__pyx_v_info->obj == Py_None) {\n    __Pyx_GOTREF(__pyx_v_info->obj);\n    __Pyx_DECREF(__pyx_v_info->obj); __pyx_v_info->obj = 0;\n  }\n  __pyx_L2:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":553\n * \n *     @property\n *     def T(self):             # <<<<<<<<<<<<<<\n *         cdef _memoryviewslice result = memoryview_copy(self)\n *         transpose_memslice(&result.from_slice)\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_1T_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_1T_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_1T___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_1T___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  struct __pyx_memoryviewslice_obj *__pyx_v_result = 0;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_t_2;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":554\n *     @property\n *     def T(self):\n *         cdef _memoryviewslice result = memoryview_copy(self)             # <<<<<<<<<<<<<<\n *         
transpose_memslice(&result.from_slice)\n *         return result\n */\n  __pyx_t_1 = __pyx_memoryview_copy_object(__pyx_v_self); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 554, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (!(likely(((__pyx_t_1) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_1, __pyx_memoryviewslice_type))))) __PYX_ERR(2, 554, __pyx_L1_error)\n  __pyx_v_result = ((struct __pyx_memoryviewslice_obj *)__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":555\n *     def T(self):\n *         cdef _memoryviewslice result = memoryview_copy(self)\n *         transpose_memslice(&result.from_slice)             # <<<<<<<<<<<<<<\n *         return result\n * \n */\n  __pyx_t_2 = __pyx_memslice_transpose((&__pyx_v_result->from_slice)); if (unlikely(__pyx_t_2 == ((int)0))) __PYX_ERR(2, 555, __pyx_L1_error)\n\n  /* \"View.MemoryView\":556\n *         cdef _memoryviewslice result = memoryview_copy(self)\n *         transpose_memslice(&result.from_slice)\n *         return result             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(((PyObject *)__pyx_v_result));\n  __pyx_r = ((PyObject *)__pyx_v_result);\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":553\n * \n *     @property\n *     def T(self):             # <<<<<<<<<<<<<<\n *         cdef _memoryviewslice result = memoryview_copy(self)\n *         transpose_memslice(&result.from_slice)\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.T.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XDECREF((PyObject *)__pyx_v_result);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":559\n * \n *     @property\n *     def base(self):             # <<<<<<<<<<<<<<\n *         return self.obj\n * \n */\n\n/* Python wrapper */\nstatic PyObject 
*__pyx_pw_15View_dot_MemoryView_10memoryview_4base_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_4base_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_4base___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_4base___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":560\n *     @property\n *     def base(self):\n *         return self.obj             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(__pyx_v_self->obj);\n  __pyx_r = __pyx_v_self->obj;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":559\n * \n *     @property\n *     def base(self):             # <<<<<<<<<<<<<<\n *         return self.obj\n * \n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":563\n * \n *     @property\n *     def shape(self):             # <<<<<<<<<<<<<<\n *         return tuple([length for length in self.view.shape[:self.view.ndim]])\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_5shape_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_5shape_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_5shape___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  
__Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_5shape___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  Py_ssize_t __pyx_v_length;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  Py_ssize_t *__pyx_t_2;\n  Py_ssize_t *__pyx_t_3;\n  Py_ssize_t *__pyx_t_4;\n  PyObject *__pyx_t_5 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":564\n *     @property\n *     def shape(self):\n *         return tuple([length for length in self.view.shape[:self.view.ndim]])             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = PyList_New(0); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 564, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_3 = (__pyx_v_self->view.shape + __pyx_v_self->view.ndim);\n  for (__pyx_t_4 = __pyx_v_self->view.shape; __pyx_t_4 < __pyx_t_3; __pyx_t_4++) {\n    __pyx_t_2 = __pyx_t_4;\n    __pyx_v_length = (__pyx_t_2[0]);\n    __pyx_t_5 = PyInt_FromSsize_t(__pyx_v_length); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 564, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_5);\n    if (unlikely(__Pyx_ListComp_Append(__pyx_t_1, (PyObject*)__pyx_t_5))) __PYX_ERR(2, 564, __pyx_L1_error)\n    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;\n  }\n  __pyx_t_5 = PyList_AsTuple(((PyObject*)__pyx_t_1)); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 564, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_5);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __pyx_r = __pyx_t_5;\n  __pyx_t_5 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":563\n * \n *     @property\n *     def shape(self):             # <<<<<<<<<<<<<<\n *         return tuple([length for length in self.view.shape[:self.view.ndim]])\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_5);\n  
__Pyx_AddTraceback(\"View.MemoryView.memoryview.shape.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":567\n * \n *     @property\n *     def strides(self):             # <<<<<<<<<<<<<<\n *         if self.view.strides == NULL:\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_7strides_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_7strides_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_7strides___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_7strides___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  Py_ssize_t __pyx_v_stride;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  Py_ssize_t *__pyx_t_3;\n  Py_ssize_t *__pyx_t_4;\n  Py_ssize_t *__pyx_t_5;\n  PyObject *__pyx_t_6 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":568\n *     @property\n *     def strides(self):\n *         if self.view.strides == NULL:             # <<<<<<<<<<<<<<\n * \n *             raise ValueError(\"Buffer view does not expose strides\")\n */\n  __pyx_t_1 = ((__pyx_v_self->view.strides == NULL) != 0);\n  if (unlikely(__pyx_t_1)) {\n\n    /* \"View.MemoryView\":570\n *         if self.view.strides == NULL:\n * \n *             raise ValueError(\"Buffer view does not expose strides\")             # <<<<<<<<<<<<<<\n * \n *         return 
tuple([stride for stride in self.view.strides[:self.view.ndim]])\n */\n    __pyx_t_2 = __Pyx_PyObject_Call(__pyx_builtin_ValueError, __pyx_tuple__14, NULL); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 570, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_Raise(__pyx_t_2, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __PYX_ERR(2, 570, __pyx_L1_error)\n\n    /* \"View.MemoryView\":568\n *     @property\n *     def strides(self):\n *         if self.view.strides == NULL:             # <<<<<<<<<<<<<<\n * \n *             raise ValueError(\"Buffer view does not expose strides\")\n */\n  }\n\n  /* \"View.MemoryView\":572\n *             raise ValueError(\"Buffer view does not expose strides\")\n * \n *         return tuple([stride for stride in self.view.strides[:self.view.ndim]])             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_2 = PyList_New(0); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 572, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_4 = (__pyx_v_self->view.strides + __pyx_v_self->view.ndim);\n  for (__pyx_t_5 = __pyx_v_self->view.strides; __pyx_t_5 < __pyx_t_4; __pyx_t_5++) {\n    __pyx_t_3 = __pyx_t_5;\n    __pyx_v_stride = (__pyx_t_3[0]);\n    __pyx_t_6 = PyInt_FromSsize_t(__pyx_v_stride); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 572, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_6);\n    if (unlikely(__Pyx_ListComp_Append(__pyx_t_2, (PyObject*)__pyx_t_6))) __PYX_ERR(2, 572, __pyx_L1_error)\n    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n  }\n  __pyx_t_6 = PyList_AsTuple(((PyObject*)__pyx_t_2)); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 572, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_6);\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  __pyx_r = __pyx_t_6;\n  __pyx_t_6 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":567\n * \n *     @property\n *     def strides(self):             # <<<<<<<<<<<<<<\n *         if self.view.strides == NULL:\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  
__Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.strides.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":575\n * \n *     @property\n *     def suboffsets(self):             # <<<<<<<<<<<<<<\n *         if self.view.suboffsets == NULL:\n *             return (-1,) * self.view.ndim\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_10suboffsets_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_10suboffsets_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_10suboffsets___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_10suboffsets___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  Py_ssize_t __pyx_v_suboffset;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  Py_ssize_t *__pyx_t_4;\n  Py_ssize_t *__pyx_t_5;\n  Py_ssize_t *__pyx_t_6;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":576\n *     @property\n *     def suboffsets(self):\n *         if self.view.suboffsets == NULL:             # <<<<<<<<<<<<<<\n *             return (-1,) * self.view.ndim\n * \n */\n  __pyx_t_1 = ((__pyx_v_self->view.suboffsets == NULL) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":577\n *     def suboffsets(self):\n *         if self.view.suboffsets == NULL:\n *     
        return (-1,) * self.view.ndim             # <<<<<<<<<<<<<<\n * \n *         return tuple([suboffset for suboffset in self.view.suboffsets[:self.view.ndim]])\n */\n    __Pyx_XDECREF(__pyx_r);\n    __pyx_t_2 = __Pyx_PyInt_From_int(__pyx_v_self->view.ndim); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 577, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_t_3 = PyNumber_Multiply(__pyx_tuple__15, __pyx_t_2); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 577, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __pyx_r = __pyx_t_3;\n    __pyx_t_3 = 0;\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":576\n *     @property\n *     def suboffsets(self):\n *         if self.view.suboffsets == NULL:             # <<<<<<<<<<<<<<\n *             return (-1,) * self.view.ndim\n * \n */\n  }\n\n  /* \"View.MemoryView\":579\n *             return (-1,) * self.view.ndim\n * \n *         return tuple([suboffset for suboffset in self.view.suboffsets[:self.view.ndim]])             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_3 = PyList_New(0); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 579, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __pyx_t_5 = (__pyx_v_self->view.suboffsets + __pyx_v_self->view.ndim);\n  for (__pyx_t_6 = __pyx_v_self->view.suboffsets; __pyx_t_6 < __pyx_t_5; __pyx_t_6++) {\n    __pyx_t_4 = __pyx_t_6;\n    __pyx_v_suboffset = (__pyx_t_4[0]);\n    __pyx_t_2 = PyInt_FromSsize_t(__pyx_v_suboffset); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 579, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    if (unlikely(__Pyx_ListComp_Append(__pyx_t_3, (PyObject*)__pyx_t_2))) __PYX_ERR(2, 579, __pyx_L1_error)\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  }\n  __pyx_t_2 = PyList_AsTuple(((PyObject*)__pyx_t_3)); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 579, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  
/* \"View.MemoryView\":575\n * \n *     @property\n *     def suboffsets(self):             # <<<<<<<<<<<<<<\n *         if self.view.suboffsets == NULL:\n *             return (-1,) * self.view.ndim\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.suboffsets.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":582\n * \n *     @property\n *     def ndim(self):             # <<<<<<<<<<<<<<\n *         return self.view.ndim\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_4ndim_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_4ndim_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_4ndim___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_4ndim___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":583\n *     @property\n *     def ndim(self):\n *         return self.view.ndim             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __Pyx_PyInt_From_int(__pyx_v_self->view.ndim); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 583, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto 
__pyx_L0;\n\n  /* \"View.MemoryView\":582\n * \n *     @property\n *     def ndim(self):             # <<<<<<<<<<<<<<\n *         return self.view.ndim\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.ndim.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":586\n * \n *     @property\n *     def itemsize(self):             # <<<<<<<<<<<<<<\n *         return self.view.itemsize\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_8itemsize_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_8itemsize_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_8itemsize___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_8itemsize___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":587\n *     @property\n *     def itemsize(self):\n *         return self.view.itemsize             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = PyInt_FromSsize_t(__pyx_v_self->view.itemsize); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 587, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":586\n * 
\n *     @property\n *     def itemsize(self):             # <<<<<<<<<<<<<<\n *         return self.view.itemsize\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.itemsize.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":590\n * \n *     @property\n *     def nbytes(self):             # <<<<<<<<<<<<<<\n *         return self.size * self.view.itemsize\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_6nbytes_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_6nbytes_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_6nbytes___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_6nbytes___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":591\n *     @property\n *     def nbytes(self):\n *         return self.size * self.view.itemsize             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_self), __pyx_n_s_size); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 591, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = 
PyInt_FromSsize_t(__pyx_v_self->view.itemsize); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 591, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_3 = PyNumber_Multiply(__pyx_t_1, __pyx_t_2); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 591, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  __pyx_r = __pyx_t_3;\n  __pyx_t_3 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":590\n * \n *     @property\n *     def nbytes(self):             # <<<<<<<<<<<<<<\n *         return self.size * self.view.itemsize\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.nbytes.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":594\n * \n *     @property\n *     def size(self):             # <<<<<<<<<<<<<<\n *         if self._size is None:\n *             result = 1\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_4size_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_10memoryview_4size_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_10memoryview_4size___get__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_10memoryview_4size___get__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_v_result = NULL;\n  PyObject *__pyx_v_length = NULL;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  Py_ssize_t 
*__pyx_t_3;\n  Py_ssize_t *__pyx_t_4;\n  Py_ssize_t *__pyx_t_5;\n  PyObject *__pyx_t_6 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":595\n *     @property\n *     def size(self):\n *         if self._size is None:             # <<<<<<<<<<<<<<\n *             result = 1\n * \n */\n  __pyx_t_1 = (__pyx_v_self->_size == Py_None);\n  __pyx_t_2 = (__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":596\n *     def size(self):\n *         if self._size is None:\n *             result = 1             # <<<<<<<<<<<<<<\n * \n *             for length in self.view.shape[:self.view.ndim]:\n */\n    __Pyx_INCREF(__pyx_int_1);\n    __pyx_v_result = __pyx_int_1;\n\n    /* \"View.MemoryView\":598\n *             result = 1\n * \n *             for length in self.view.shape[:self.view.ndim]:             # <<<<<<<<<<<<<<\n *                 result *= length\n * \n */\n    __pyx_t_4 = (__pyx_v_self->view.shape + __pyx_v_self->view.ndim);\n    for (__pyx_t_5 = __pyx_v_self->view.shape; __pyx_t_5 < __pyx_t_4; __pyx_t_5++) {\n      __pyx_t_3 = __pyx_t_5;\n      __pyx_t_6 = PyInt_FromSsize_t((__pyx_t_3[0])); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 598, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_6);\n      __Pyx_XDECREF_SET(__pyx_v_length, __pyx_t_6);\n      __pyx_t_6 = 0;\n\n      /* \"View.MemoryView\":599\n * \n *             for length in self.view.shape[:self.view.ndim]:\n *                 result *= length             # <<<<<<<<<<<<<<\n * \n *             self._size = result\n */\n      __pyx_t_6 = PyNumber_InPlaceMultiply(__pyx_v_result, __pyx_v_length); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 599, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_6);\n      __Pyx_DECREF_SET(__pyx_v_result, __pyx_t_6);\n      __pyx_t_6 = 0;\n    }\n\n    /* \"View.MemoryView\":601\n *                 result *= length\n * \n *             self._size = result          
   # <<<<<<<<<<<<<<\n * \n *         return self._size\n */\n    __Pyx_INCREF(__pyx_v_result);\n    __Pyx_GIVEREF(__pyx_v_result);\n    __Pyx_GOTREF(__pyx_v_self->_size);\n    __Pyx_DECREF(__pyx_v_self->_size);\n    __pyx_v_self->_size = __pyx_v_result;\n\n    /* \"View.MemoryView\":595\n *     @property\n *     def size(self):\n *         if self._size is None:             # <<<<<<<<<<<<<<\n *             result = 1\n * \n */\n  }\n\n  /* \"View.MemoryView\":603\n *             self._size = result\n * \n *         return self._size             # <<<<<<<<<<<<<<\n * \n *     def __len__(self):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(__pyx_v_self->_size);\n  __pyx_r = __pyx_v_self->_size;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":594\n * \n *     @property\n *     def size(self):             # <<<<<<<<<<<<<<\n *         if self._size is None:\n *             result = 1\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.size.__get__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_result);\n  __Pyx_XDECREF(__pyx_v_length);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":605\n *         return self._size\n * \n *     def __len__(self):             # <<<<<<<<<<<<<<\n *         if self.view.ndim >= 1:\n *             return self.view.shape[0]\n */\n\n/* Python wrapper */\nstatic Py_ssize_t __pyx_memoryview___len__(PyObject *__pyx_v_self); /*proto*/\nstatic Py_ssize_t __pyx_memoryview___len__(PyObject *__pyx_v_self) {\n  Py_ssize_t __pyx_r;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__len__ (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_10__len__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic Py_ssize_t 
__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_10__len__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  Py_ssize_t __pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  __Pyx_RefNannySetupContext(\"__len__\", 0);\n\n  /* \"View.MemoryView\":606\n * \n *     def __len__(self):\n *         if self.view.ndim >= 1:             # <<<<<<<<<<<<<<\n *             return self.view.shape[0]\n * \n */\n  __pyx_t_1 = ((__pyx_v_self->view.ndim >= 1) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":607\n *     def __len__(self):\n *         if self.view.ndim >= 1:\n *             return self.view.shape[0]             # <<<<<<<<<<<<<<\n * \n *         return 0\n */\n    __pyx_r = (__pyx_v_self->view.shape[0]);\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":606\n * \n *     def __len__(self):\n *         if self.view.ndim >= 1:             # <<<<<<<<<<<<<<\n *             return self.view.shape[0]\n * \n */\n  }\n\n  /* \"View.MemoryView\":609\n *             return self.view.shape[0]\n * \n *         return 0             # <<<<<<<<<<<<<<\n * \n *     def __repr__(self):\n */\n  __pyx_r = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":605\n *         return self._size\n * \n *     def __len__(self):             # <<<<<<<<<<<<<<\n *         if self.view.ndim >= 1:\n *             return self.view.shape[0]\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":611\n *         return 0\n * \n *     def __repr__(self):             # <<<<<<<<<<<<<<\n *         return \"<MemoryView of %r at 0x%x>\" % (self.base.__class__.__name__,\n *                                                id(self))\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_memoryview___repr__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_memoryview___repr__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__repr__ (wrapper)\", 0);\n 
 __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_12__repr__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_12__repr__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__repr__\", 0);\n\n  /* \"View.MemoryView\":612\n * \n *     def __repr__(self):\n *         return \"<MemoryView of %r at 0x%x>\" % (self.base.__class__.__name__,             # <<<<<<<<<<<<<<\n *                                                id(self))\n * \n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_self), __pyx_n_s_base); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 612, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = __Pyx_PyObject_GetAttrStr(__pyx_t_1, __pyx_n_s_class); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 612, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_t_2, __pyx_n_s_name_2); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 612, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n\n  /* \"View.MemoryView\":613\n *     def __repr__(self):\n *         return \"<MemoryView of %r at 0x%x>\" % (self.base.__class__.__name__,\n *                                                id(self))             # <<<<<<<<<<<<<<\n * \n *     def __str__(self):\n */\n  __pyx_t_2 = __Pyx_PyObject_CallOneArg(__pyx_builtin_id, ((PyObject *)__pyx_v_self)); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 613, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n\n  /* \"View.MemoryView\":612\n * \n *     def 
__repr__(self):\n *         return \"<MemoryView of %r at 0x%x>\" % (self.base.__class__.__name__,             # <<<<<<<<<<<<<<\n *                                                id(self))\n * \n */\n  __pyx_t_3 = PyTuple_New(2); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 612, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __Pyx_GIVEREF(__pyx_t_1);\n  PyTuple_SET_ITEM(__pyx_t_3, 0, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_2);\n  PyTuple_SET_ITEM(__pyx_t_3, 1, __pyx_t_2);\n  __pyx_t_1 = 0;\n  __pyx_t_2 = 0;\n  __pyx_t_2 = __Pyx_PyString_Format(__pyx_kp_s_MemoryView_of_r_at_0x_x, __pyx_t_3); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 612, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":611\n *         return 0\n * \n *     def __repr__(self):             # <<<<<<<<<<<<<<\n *         return \"<MemoryView of %r at 0x%x>\" % (self.base.__class__.__name__,\n *                                                id(self))\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__repr__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":615\n *                                                id(self))\n * \n *     def __str__(self):             # <<<<<<<<<<<<<<\n *         return \"<MemoryView of %r object>\" % (self.base.__class__.__name__,)\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_memoryview___str__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_memoryview___str__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__str__ (wrapper)\", 0);\n  __pyx_r = 
__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_14__str__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_14__str__(struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__str__\", 0);\n\n  /* \"View.MemoryView\":616\n * \n *     def __str__(self):\n *         return \"<MemoryView of %r object>\" % (self.base.__class__.__name__,)             # <<<<<<<<<<<<<<\n * \n * \n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_self), __pyx_n_s_base); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 616, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = __Pyx_PyObject_GetAttrStr(__pyx_t_1, __pyx_n_s_class); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 616, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_t_2, __pyx_n_s_name_2); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 616, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  __pyx_t_2 = PyTuple_New(1); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 616, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_GIVEREF(__pyx_t_1);\n  PyTuple_SET_ITEM(__pyx_t_2, 0, __pyx_t_1);\n  __pyx_t_1 = 0;\n  __pyx_t_1 = __Pyx_PyString_Format(__pyx_kp_s_MemoryView_of_r_object, __pyx_t_2); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 616, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":615\n *                                                id(self))\n * \n *     def 
__str__(self):             # <<<<<<<<<<<<<<\n *         return \"<MemoryView of %r object>\" % (self.base.__class__.__name__,)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__str__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":619\n * \n * \n *     def is_c_contig(self):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice *mslice\n *         cdef __Pyx_memviewslice tmp\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_memoryview_is_c_contig(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_memoryview_is_c_contig(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"is_c_contig (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_16is_c_contig(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_16is_c_contig(struct __pyx_memoryview_obj *__pyx_v_self) {\n  __Pyx_memviewslice *__pyx_v_mslice;\n  __Pyx_memviewslice __pyx_v_tmp;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_memviewslice *__pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"is_c_contig\", 0);\n\n  /* \"View.MemoryView\":622\n *         cdef __Pyx_memviewslice *mslice\n *         cdef __Pyx_memviewslice tmp\n *         mslice = get_slice_from_memview(self, &tmp)             # <<<<<<<<<<<<<<\n *         return slice_is_contig(mslice[0], 'C', self.view.ndim)\n * \n */\n  
__pyx_t_1 = __pyx_memoryview_get_slice_from_memoryview(__pyx_v_self, (&__pyx_v_tmp)); if (unlikely(__pyx_t_1 == ((__Pyx_memviewslice *)NULL))) __PYX_ERR(2, 622, __pyx_L1_error)\n  __pyx_v_mslice = __pyx_t_1;\n\n  /* \"View.MemoryView\":623\n *         cdef __Pyx_memviewslice tmp\n *         mslice = get_slice_from_memview(self, &tmp)\n *         return slice_is_contig(mslice[0], 'C', self.view.ndim)             # <<<<<<<<<<<<<<\n * \n *     def is_f_contig(self):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_2 = __Pyx_PyBool_FromLong(__pyx_memviewslice_is_contig((__pyx_v_mslice[0]), 'C', __pyx_v_self->view.ndim)); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 623, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":619\n * \n * \n *     def is_c_contig(self):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice *mslice\n *         cdef __Pyx_memviewslice tmp\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.is_c_contig\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":625\n *         return slice_is_contig(mslice[0], 'C', self.view.ndim)\n * \n *     def is_f_contig(self):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice *mslice\n *         cdef __Pyx_memviewslice tmp\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_memoryview_is_f_contig(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_memoryview_is_f_contig(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"is_f_contig (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_18is_f_contig(((struct __pyx_memoryview_obj 
*)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_18is_f_contig(struct __pyx_memoryview_obj *__pyx_v_self) {\n  __Pyx_memviewslice *__pyx_v_mslice;\n  __Pyx_memviewslice __pyx_v_tmp;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_memviewslice *__pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"is_f_contig\", 0);\n\n  /* \"View.MemoryView\":628\n *         cdef __Pyx_memviewslice *mslice\n *         cdef __Pyx_memviewslice tmp\n *         mslice = get_slice_from_memview(self, &tmp)             # <<<<<<<<<<<<<<\n *         return slice_is_contig(mslice[0], 'F', self.view.ndim)\n * \n */\n  __pyx_t_1 = __pyx_memoryview_get_slice_from_memoryview(__pyx_v_self, (&__pyx_v_tmp)); if (unlikely(__pyx_t_1 == ((__Pyx_memviewslice *)NULL))) __PYX_ERR(2, 628, __pyx_L1_error)\n  __pyx_v_mslice = __pyx_t_1;\n\n  /* \"View.MemoryView\":629\n *         cdef __Pyx_memviewslice tmp\n *         mslice = get_slice_from_memview(self, &tmp)\n *         return slice_is_contig(mslice[0], 'F', self.view.ndim)             # <<<<<<<<<<<<<<\n * \n *     def copy(self):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_2 = __Pyx_PyBool_FromLong(__pyx_memviewslice_is_contig((__pyx_v_mslice[0]), 'F', __pyx_v_self->view.ndim)); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 629, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":625\n *         return slice_is_contig(mslice[0], 'C', self.view.ndim)\n * \n *     def is_f_contig(self):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice *mslice\n *         cdef __Pyx_memviewslice tmp\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  
__Pyx_AddTraceback(\"View.MemoryView.memoryview.is_f_contig\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":631\n *         return slice_is_contig(mslice[0], 'F', self.view.ndim)\n * \n *     def copy(self):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice mslice\n *         cdef int flags = self.flags & ~PyBUF_F_CONTIGUOUS\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_memoryview_copy(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_memoryview_copy(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"copy (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_20copy(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_20copy(struct __pyx_memoryview_obj *__pyx_v_self) {\n  __Pyx_memviewslice __pyx_v_mslice;\n  int __pyx_v_flags;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_memviewslice __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"copy\", 0);\n\n  /* \"View.MemoryView\":633\n *     def copy(self):\n *         cdef __Pyx_memviewslice mslice\n *         cdef int flags = self.flags & ~PyBUF_F_CONTIGUOUS             # <<<<<<<<<<<<<<\n * \n *         slice_copy(self, &mslice)\n */\n  __pyx_v_flags = (__pyx_v_self->flags & (~PyBUF_F_CONTIGUOUS));\n\n  /* \"View.MemoryView\":635\n *         cdef int flags = self.flags & ~PyBUF_F_CONTIGUOUS\n * \n *         slice_copy(self, &mslice)             # <<<<<<<<<<<<<<\n *         mslice = 
slice_copy_contig(&mslice, \"c\", self.view.ndim,\n *                                    self.view.itemsize,\n */\n  __pyx_memoryview_slice_copy(__pyx_v_self, (&__pyx_v_mslice));\n\n  /* \"View.MemoryView\":636\n * \n *         slice_copy(self, &mslice)\n *         mslice = slice_copy_contig(&mslice, \"c\", self.view.ndim,             # <<<<<<<<<<<<<<\n *                                    self.view.itemsize,\n *                                    flags|PyBUF_C_CONTIGUOUS,\n */\n  __pyx_t_1 = __pyx_memoryview_copy_new_contig((&__pyx_v_mslice), ((char *)\"c\"), __pyx_v_self->view.ndim, __pyx_v_self->view.itemsize, (__pyx_v_flags | PyBUF_C_CONTIGUOUS), __pyx_v_self->dtype_is_object); if (unlikely(PyErr_Occurred())) __PYX_ERR(2, 636, __pyx_L1_error)\n  __pyx_v_mslice = __pyx_t_1;\n\n  /* \"View.MemoryView\":641\n *                                    self.dtype_is_object)\n * \n *         return memoryview_copy_from_slice(self, &mslice)             # <<<<<<<<<<<<<<\n * \n *     def copy_fortran(self):\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_2 = __pyx_memoryview_copy_object_from_slice(__pyx_v_self, (&__pyx_v_mslice)); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 641, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":631\n *         return slice_is_contig(mslice[0], 'F', self.view.ndim)\n * \n *     def copy(self):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice mslice\n *         cdef int flags = self.flags & ~PyBUF_F_CONTIGUOUS\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.copy\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":643\n *         return memoryview_copy_from_slice(self, &mslice)\n * \n *     def copy_fortran(self):             # 
<<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice src, dst\n *         cdef int flags = self.flags & ~PyBUF_C_CONTIGUOUS\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_memoryview_copy_fortran(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_memoryview_copy_fortran(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"copy_fortran (wrapper)\", 0);\n  __pyx_r = __pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_22copy_fortran(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_memoryview___pyx_pf_15View_dot_MemoryView_10memoryview_22copy_fortran(struct __pyx_memoryview_obj *__pyx_v_self) {\n  __Pyx_memviewslice __pyx_v_src;\n  __Pyx_memviewslice __pyx_v_dst;\n  int __pyx_v_flags;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_memviewslice __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"copy_fortran\", 0);\n\n  /* \"View.MemoryView\":645\n *     def copy_fortran(self):\n *         cdef __Pyx_memviewslice src, dst\n *         cdef int flags = self.flags & ~PyBUF_C_CONTIGUOUS             # <<<<<<<<<<<<<<\n * \n *         slice_copy(self, &src)\n */\n  __pyx_v_flags = (__pyx_v_self->flags & (~PyBUF_C_CONTIGUOUS));\n\n  /* \"View.MemoryView\":647\n *         cdef int flags = self.flags & ~PyBUF_C_CONTIGUOUS\n * \n *         slice_copy(self, &src)             # <<<<<<<<<<<<<<\n *         dst = slice_copy_contig(&src, \"fortran\", self.view.ndim,\n *                                 self.view.itemsize,\n */\n  __pyx_memoryview_slice_copy(__pyx_v_self, (&__pyx_v_src));\n\n  /* \"View.MemoryView\":648\n * \n *         slice_copy(self, &src)\n *         dst = slice_copy_contig(&src, 
\"fortran\", self.view.ndim,             # <<<<<<<<<<<<<<\n *                                 self.view.itemsize,\n *                                 flags|PyBUF_F_CONTIGUOUS,\n */\n  __pyx_t_1 = __pyx_memoryview_copy_new_contig((&__pyx_v_src), ((char *)\"fortran\"), __pyx_v_self->view.ndim, __pyx_v_self->view.itemsize, (__pyx_v_flags | PyBUF_F_CONTIGUOUS), __pyx_v_self->dtype_is_object); if (unlikely(PyErr_Occurred())) __PYX_ERR(2, 648, __pyx_L1_error)\n  __pyx_v_dst = __pyx_t_1;\n\n  /* \"View.MemoryView\":653\n *                                 self.dtype_is_object)\n * \n *         return memoryview_copy_from_slice(self, &dst)             # <<<<<<<<<<<<<<\n * \n * \n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_2 = __pyx_memoryview_copy_object_from_slice(__pyx_v_self, (&__pyx_v_dst)); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 653, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_r = __pyx_t_2;\n  __pyx_t_2 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":643\n *         return memoryview_copy_from_slice(self, &mslice)\n * \n *     def copy_fortran(self):             # <<<<<<<<<<<<<<\n *         cdef __Pyx_memviewslice src, dst\n *         cdef int flags = self.flags & ~PyBUF_C_CONTIGUOUS\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.copy_fortran\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_memoryview_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_pw___pyx_memoryview_1__reduce_cython__(PyObject *__pyx_v_self, 
CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__reduce_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_memoryview___reduce_cython__(((struct __pyx_memoryview_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_memoryview___reduce_cython__(CYTHON_UNUSED struct __pyx_memoryview_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__reduce_cython__\", 0);\n\n  /* \"(tree fragment)\":2\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n  __pyx_t_1 = __Pyx_PyObject_Call(__pyx_builtin_TypeError, __pyx_tuple__16, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 2, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_Raise(__pyx_t_1, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __PYX_ERR(2, 2, __pyx_L1_error)\n\n  /* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__reduce_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":3\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, 
__pyx_state):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_memoryview_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state); /*proto*/\nstatic PyObject *__pyx_pw___pyx_memoryview_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__setstate_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_memoryview_2__setstate_cython__(((struct __pyx_memoryview_obj *)__pyx_v_self), ((PyObject *)__pyx_v___pyx_state));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_memoryview_2__setstate_cython__(CYTHON_UNUSED struct __pyx_memoryview_obj *__pyx_v_self, CYTHON_UNUSED PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__setstate_cython__\", 0);\n\n  /* \"(tree fragment)\":4\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n */\n  __pyx_t_1 = __Pyx_PyObject_Call(__pyx_builtin_TypeError, __pyx_tuple__17, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 4, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_Raise(__pyx_t_1, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __PYX_ERR(2, 4, __pyx_L1_error)\n\n  /* \"(tree fragment)\":3\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to 
non-trivial __cinit__\")\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview.__setstate_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":657\n * \n * @cname('__pyx_memoryview_new')\n * cdef memoryview_cwrapper(object o, int flags, bint dtype_is_object, __Pyx_TypeInfo *typeinfo):             # <<<<<<<<<<<<<<\n *     cdef memoryview result = memoryview(o, flags, dtype_is_object)\n *     result.typeinfo = typeinfo\n */\n\nstatic PyObject *__pyx_memoryview_new(PyObject *__pyx_v_o, int __pyx_v_flags, int __pyx_v_dtype_is_object, __Pyx_TypeInfo *__pyx_v_typeinfo) {\n  struct __pyx_memoryview_obj *__pyx_v_result = 0;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"memoryview_cwrapper\", 0);\n\n  /* \"View.MemoryView\":658\n * @cname('__pyx_memoryview_new')\n * cdef memoryview_cwrapper(object o, int flags, bint dtype_is_object, __Pyx_TypeInfo *typeinfo):\n *     cdef memoryview result = memoryview(o, flags, dtype_is_object)             # <<<<<<<<<<<<<<\n *     result.typeinfo = typeinfo\n *     return result\n */\n  __pyx_t_1 = __Pyx_PyInt_From_int(__pyx_v_flags); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 658, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = __Pyx_PyBool_FromLong(__pyx_v_dtype_is_object); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 658, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_3 = PyTuple_New(3); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 658, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __Pyx_INCREF(__pyx_v_o);\n  __Pyx_GIVEREF(__pyx_v_o);\n  PyTuple_SET_ITEM(__pyx_t_3, 0, __pyx_v_o);\n  
__Pyx_GIVEREF(__pyx_t_1);\n  PyTuple_SET_ITEM(__pyx_t_3, 1, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_2);\n  PyTuple_SET_ITEM(__pyx_t_3, 2, __pyx_t_2);\n  __pyx_t_1 = 0;\n  __pyx_t_2 = 0;\n  __pyx_t_2 = __Pyx_PyObject_Call(((PyObject *)__pyx_memoryview_type), __pyx_t_3, NULL); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 658, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __pyx_v_result = ((struct __pyx_memoryview_obj *)__pyx_t_2);\n  __pyx_t_2 = 0;\n\n  /* \"View.MemoryView\":659\n * cdef memoryview_cwrapper(object o, int flags, bint dtype_is_object, __Pyx_TypeInfo *typeinfo):\n *     cdef memoryview result = memoryview(o, flags, dtype_is_object)\n *     result.typeinfo = typeinfo             # <<<<<<<<<<<<<<\n *     return result\n * \n */\n  __pyx_v_result->typeinfo = __pyx_v_typeinfo;\n\n  /* \"View.MemoryView\":660\n *     cdef memoryview result = memoryview(o, flags, dtype_is_object)\n *     result.typeinfo = typeinfo\n *     return result             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_check')\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(((PyObject *)__pyx_v_result));\n  __pyx_r = ((PyObject *)__pyx_v_result);\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":657\n * \n * @cname('__pyx_memoryview_new')\n * cdef memoryview_cwrapper(object o, int flags, bint dtype_is_object, __Pyx_TypeInfo *typeinfo):             # <<<<<<<<<<<<<<\n *     cdef memoryview result = memoryview(o, flags, dtype_is_object)\n *     result.typeinfo = typeinfo\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview_cwrapper\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XDECREF((PyObject *)__pyx_v_result);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":663\n * \n * @cname('__pyx_memoryview_check')\n * 
cdef inline bint memoryview_check(object o):             # <<<<<<<<<<<<<<\n *     return isinstance(o, memoryview)\n * \n */\n\nstatic CYTHON_INLINE int __pyx_memoryview_check(PyObject *__pyx_v_o) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  __Pyx_RefNannySetupContext(\"memoryview_check\", 0);\n\n  /* \"View.MemoryView\":664\n * @cname('__pyx_memoryview_check')\n * cdef inline bint memoryview_check(object o):\n *     return isinstance(o, memoryview)             # <<<<<<<<<<<<<<\n * \n * cdef tuple _unellipsify(object index, int ndim):\n */\n  __pyx_t_1 = __Pyx_TypeCheck(__pyx_v_o, __pyx_memoryview_type); \n  __pyx_r = __pyx_t_1;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":663\n * \n * @cname('__pyx_memoryview_check')\n * cdef inline bint memoryview_check(object o):             # <<<<<<<<<<<<<<\n *     return isinstance(o, memoryview)\n * \n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":666\n *     return isinstance(o, memoryview)\n * \n * cdef tuple _unellipsify(object index, int ndim):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Replace all ellipses with full slices and fill incomplete indices with\n */\n\nstatic PyObject *_unellipsify(PyObject *__pyx_v_index, int __pyx_v_ndim) {\n  PyObject *__pyx_v_tup = NULL;\n  PyObject *__pyx_v_result = NULL;\n  int __pyx_v_have_slices;\n  int __pyx_v_seen_ellipsis;\n  CYTHON_UNUSED PyObject *__pyx_v_idx = NULL;\n  PyObject *__pyx_v_item = NULL;\n  Py_ssize_t __pyx_v_nslices;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  Py_ssize_t __pyx_t_5;\n  PyObject *(*__pyx_t_6)(PyObject *);\n  PyObject *__pyx_t_7 = NULL;\n  Py_ssize_t __pyx_t_8;\n  int __pyx_t_9;\n  int __pyx_t_10;\n  PyObject *__pyx_t_11 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  
__Pyx_RefNannySetupContext(\"_unellipsify\", 0);\n\n  /* \"View.MemoryView\":671\n *     full slices.\n *     \"\"\"\n *     if not isinstance(index, tuple):             # <<<<<<<<<<<<<<\n *         tup = (index,)\n *     else:\n */\n  __pyx_t_1 = PyTuple_Check(__pyx_v_index); \n  __pyx_t_2 = ((!(__pyx_t_1 != 0)) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":672\n *     \"\"\"\n *     if not isinstance(index, tuple):\n *         tup = (index,)             # <<<<<<<<<<<<<<\n *     else:\n *         tup = index\n */\n    __pyx_t_3 = PyTuple_New(1); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 672, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_INCREF(__pyx_v_index);\n    __Pyx_GIVEREF(__pyx_v_index);\n    PyTuple_SET_ITEM(__pyx_t_3, 0, __pyx_v_index);\n    __pyx_v_tup = __pyx_t_3;\n    __pyx_t_3 = 0;\n\n    /* \"View.MemoryView\":671\n *     full slices.\n *     \"\"\"\n *     if not isinstance(index, tuple):             # <<<<<<<<<<<<<<\n *         tup = (index,)\n *     else:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":674\n *         tup = (index,)\n *     else:\n *         tup = index             # <<<<<<<<<<<<<<\n * \n *     result = []\n */\n  /*else*/ {\n    __Pyx_INCREF(__pyx_v_index);\n    __pyx_v_tup = __pyx_v_index;\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":676\n *         tup = index\n * \n *     result = []             # <<<<<<<<<<<<<<\n *     have_slices = False\n *     seen_ellipsis = False\n */\n  __pyx_t_3 = PyList_New(0); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 676, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __pyx_v_result = ((PyObject*)__pyx_t_3);\n  __pyx_t_3 = 0;\n\n  /* \"View.MemoryView\":677\n * \n *     result = []\n *     have_slices = False             # <<<<<<<<<<<<<<\n *     seen_ellipsis = False\n *     for idx, item in enumerate(tup):\n */\n  __pyx_v_have_slices = 0;\n\n  /* \"View.MemoryView\":678\n *     result = []\n *     have_slices = False\n *     seen_ellipsis = False             # 
<<<<<<<<<<<<<<\n *     for idx, item in enumerate(tup):\n *         if item is Ellipsis:\n */\n  __pyx_v_seen_ellipsis = 0;\n\n  /* \"View.MemoryView\":679\n *     have_slices = False\n *     seen_ellipsis = False\n *     for idx, item in enumerate(tup):             # <<<<<<<<<<<<<<\n *         if item is Ellipsis:\n *             if not seen_ellipsis:\n */\n  __Pyx_INCREF(__pyx_int_0);\n  __pyx_t_3 = __pyx_int_0;\n  if (likely(PyList_CheckExact(__pyx_v_tup)) || PyTuple_CheckExact(__pyx_v_tup)) {\n    __pyx_t_4 = __pyx_v_tup; __Pyx_INCREF(__pyx_t_4); __pyx_t_5 = 0;\n    __pyx_t_6 = NULL;\n  } else {\n    __pyx_t_5 = -1; __pyx_t_4 = PyObject_GetIter(__pyx_v_tup); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 679, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_6 = Py_TYPE(__pyx_t_4)->tp_iternext; if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 679, __pyx_L1_error)\n  }\n  for (;;) {\n    if (likely(!__pyx_t_6)) {\n      if (likely(PyList_CheckExact(__pyx_t_4))) {\n        if (__pyx_t_5 >= PyList_GET_SIZE(__pyx_t_4)) break;\n        #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n        __pyx_t_7 = PyList_GET_ITEM(__pyx_t_4, __pyx_t_5); __Pyx_INCREF(__pyx_t_7); __pyx_t_5++; if (unlikely(0 < 0)) __PYX_ERR(2, 679, __pyx_L1_error)\n        #else\n        __pyx_t_7 = PySequence_ITEM(__pyx_t_4, __pyx_t_5); __pyx_t_5++; if (unlikely(!__pyx_t_7)) __PYX_ERR(2, 679, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_7);\n        #endif\n      } else {\n        if (__pyx_t_5 >= PyTuple_GET_SIZE(__pyx_t_4)) break;\n        #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n        __pyx_t_7 = PyTuple_GET_ITEM(__pyx_t_4, __pyx_t_5); __Pyx_INCREF(__pyx_t_7); __pyx_t_5++; if (unlikely(0 < 0)) __PYX_ERR(2, 679, __pyx_L1_error)\n        #else\n        __pyx_t_7 = PySequence_ITEM(__pyx_t_4, __pyx_t_5); __pyx_t_5++; if (unlikely(!__pyx_t_7)) __PYX_ERR(2, 679, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_7);\n        #endif\n      }\n    } else {\n      __pyx_t_7 = 
__pyx_t_6(__pyx_t_4);\n      if (unlikely(!__pyx_t_7)) {\n        PyObject* exc_type = PyErr_Occurred();\n        if (exc_type) {\n          if (likely(__Pyx_PyErr_GivenExceptionMatches(exc_type, PyExc_StopIteration))) PyErr_Clear();\n          else __PYX_ERR(2, 679, __pyx_L1_error)\n        }\n        break;\n      }\n      __Pyx_GOTREF(__pyx_t_7);\n    }\n    __Pyx_XDECREF_SET(__pyx_v_item, __pyx_t_7);\n    __pyx_t_7 = 0;\n    __Pyx_INCREF(__pyx_t_3);\n    __Pyx_XDECREF_SET(__pyx_v_idx, __pyx_t_3);\n    __pyx_t_7 = __Pyx_PyInt_AddObjC(__pyx_t_3, __pyx_int_1, 1, 0, 0); if (unlikely(!__pyx_t_7)) __PYX_ERR(2, 679, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_7);\n    __Pyx_DECREF(__pyx_t_3);\n    __pyx_t_3 = __pyx_t_7;\n    __pyx_t_7 = 0;\n\n    /* \"View.MemoryView\":680\n *     seen_ellipsis = False\n *     for idx, item in enumerate(tup):\n *         if item is Ellipsis:             # <<<<<<<<<<<<<<\n *             if not seen_ellipsis:\n *                 result.extend([slice(None)] * (ndim - len(tup) + 1))\n */\n    __pyx_t_2 = (__pyx_v_item == __pyx_builtin_Ellipsis);\n    __pyx_t_1 = (__pyx_t_2 != 0);\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":681\n *     for idx, item in enumerate(tup):\n *         if item is Ellipsis:\n *             if not seen_ellipsis:             # <<<<<<<<<<<<<<\n *                 result.extend([slice(None)] * (ndim - len(tup) + 1))\n *                 seen_ellipsis = True\n */\n      __pyx_t_1 = ((!(__pyx_v_seen_ellipsis != 0)) != 0);\n      if (__pyx_t_1) {\n\n        /* \"View.MemoryView\":682\n *         if item is Ellipsis:\n *             if not seen_ellipsis:\n *                 result.extend([slice(None)] * (ndim - len(tup) + 1))             # <<<<<<<<<<<<<<\n *                 seen_ellipsis = True\n *             else:\n */\n        __pyx_t_8 = PyObject_Length(__pyx_v_tup); if (unlikely(__pyx_t_8 == ((Py_ssize_t)-1))) __PYX_ERR(2, 682, __pyx_L1_error)\n        __pyx_t_7 = PyList_New(1 * ((((__pyx_v_ndim - __pyx_t_8) + 
1)<0) ? 0:((__pyx_v_ndim - __pyx_t_8) + 1))); if (unlikely(!__pyx_t_7)) __PYX_ERR(2, 682, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_7);\n        { Py_ssize_t __pyx_temp;\n          for (__pyx_temp=0; __pyx_temp < ((__pyx_v_ndim - __pyx_t_8) + 1); __pyx_temp++) {\n            __Pyx_INCREF(__pyx_slice__18);\n            __Pyx_GIVEREF(__pyx_slice__18);\n            PyList_SET_ITEM(__pyx_t_7, __pyx_temp, __pyx_slice__18);\n          }\n        }\n        __pyx_t_9 = __Pyx_PyList_Extend(__pyx_v_result, __pyx_t_7); if (unlikely(__pyx_t_9 == ((int)-1))) __PYX_ERR(2, 682, __pyx_L1_error)\n        __Pyx_DECREF(__pyx_t_7); __pyx_t_7 = 0;\n\n        /* \"View.MemoryView\":683\n *             if not seen_ellipsis:\n *                 result.extend([slice(None)] * (ndim - len(tup) + 1))\n *                 seen_ellipsis = True             # <<<<<<<<<<<<<<\n *             else:\n *                 result.append(slice(None))\n */\n        __pyx_v_seen_ellipsis = 1;\n\n        /* \"View.MemoryView\":681\n *     for idx, item in enumerate(tup):\n *         if item is Ellipsis:\n *             if not seen_ellipsis:             # <<<<<<<<<<<<<<\n *                 result.extend([slice(None)] * (ndim - len(tup) + 1))\n *                 seen_ellipsis = True\n */\n        goto __pyx_L7;\n      }\n\n      /* \"View.MemoryView\":685\n *                 seen_ellipsis = True\n *             else:\n *                 result.append(slice(None))             # <<<<<<<<<<<<<<\n *             have_slices = True\n *         else:\n */\n      /*else*/ {\n        __pyx_t_9 = __Pyx_PyList_Append(__pyx_v_result, __pyx_slice__18); if (unlikely(__pyx_t_9 == ((int)-1))) __PYX_ERR(2, 685, __pyx_L1_error)\n      }\n      __pyx_L7:;\n\n      /* \"View.MemoryView\":686\n *             else:\n *                 result.append(slice(None))\n *             have_slices = True             # <<<<<<<<<<<<<<\n *         else:\n *             if not isinstance(item, slice) and not PyIndex_Check(item):\n */\n      
__pyx_v_have_slices = 1;\n\n      /* \"View.MemoryView\":680\n *     seen_ellipsis = False\n *     for idx, item in enumerate(tup):\n *         if item is Ellipsis:             # <<<<<<<<<<<<<<\n *             if not seen_ellipsis:\n *                 result.extend([slice(None)] * (ndim - len(tup) + 1))\n */\n      goto __pyx_L6;\n    }\n\n    /* \"View.MemoryView\":688\n *             have_slices = True\n *         else:\n *             if not isinstance(item, slice) and not PyIndex_Check(item):             # <<<<<<<<<<<<<<\n *                 raise TypeError(\"Cannot index with type '%s'\" % type(item))\n * \n */\n    /*else*/ {\n      __pyx_t_2 = PySlice_Check(__pyx_v_item); \n      __pyx_t_10 = ((!(__pyx_t_2 != 0)) != 0);\n      if (__pyx_t_10) {\n      } else {\n        __pyx_t_1 = __pyx_t_10;\n        goto __pyx_L9_bool_binop_done;\n      }\n      __pyx_t_10 = ((!(PyIndex_Check(__pyx_v_item) != 0)) != 0);\n      __pyx_t_1 = __pyx_t_10;\n      __pyx_L9_bool_binop_done:;\n      if (unlikely(__pyx_t_1)) {\n\n        /* \"View.MemoryView\":689\n *         else:\n *             if not isinstance(item, slice) and not PyIndex_Check(item):\n *                 raise TypeError(\"Cannot index with type '%s'\" % type(item))             # <<<<<<<<<<<<<<\n * \n *             have_slices = have_slices or isinstance(item, slice)\n */\n        __pyx_t_7 = __Pyx_PyString_FormatSafe(__pyx_kp_s_Cannot_index_with_type_s, ((PyObject *)Py_TYPE(__pyx_v_item))); if (unlikely(!__pyx_t_7)) __PYX_ERR(2, 689, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_7);\n        __pyx_t_11 = __Pyx_PyObject_CallOneArg(__pyx_builtin_TypeError, __pyx_t_7); if (unlikely(!__pyx_t_11)) __PYX_ERR(2, 689, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_11);\n        __Pyx_DECREF(__pyx_t_7); __pyx_t_7 = 0;\n        __Pyx_Raise(__pyx_t_11, 0, 0, 0);\n        __Pyx_DECREF(__pyx_t_11); __pyx_t_11 = 0;\n        __PYX_ERR(2, 689, __pyx_L1_error)\n\n        /* \"View.MemoryView\":688\n *             have_slices = 
True\n *         else:\n *             if not isinstance(item, slice) and not PyIndex_Check(item):             # <<<<<<<<<<<<<<\n *                 raise TypeError(\"Cannot index with type '%s'\" % type(item))\n * \n */\n      }\n\n      /* \"View.MemoryView\":691\n *                 raise TypeError(\"Cannot index with type '%s'\" % type(item))\n * \n *             have_slices = have_slices or isinstance(item, slice)             # <<<<<<<<<<<<<<\n *             result.append(item)\n * \n */\n      __pyx_t_10 = (__pyx_v_have_slices != 0);\n      if (!__pyx_t_10) {\n      } else {\n        __pyx_t_1 = __pyx_t_10;\n        goto __pyx_L11_bool_binop_done;\n      }\n      __pyx_t_10 = PySlice_Check(__pyx_v_item); \n      __pyx_t_2 = (__pyx_t_10 != 0);\n      __pyx_t_1 = __pyx_t_2;\n      __pyx_L11_bool_binop_done:;\n      __pyx_v_have_slices = __pyx_t_1;\n\n      /* \"View.MemoryView\":692\n * \n *             have_slices = have_slices or isinstance(item, slice)\n *             result.append(item)             # <<<<<<<<<<<<<<\n * \n *     nslices = ndim - len(result)\n */\n      __pyx_t_9 = __Pyx_PyList_Append(__pyx_v_result, __pyx_v_item); if (unlikely(__pyx_t_9 == ((int)-1))) __PYX_ERR(2, 692, __pyx_L1_error)\n    }\n    __pyx_L6:;\n\n    /* \"View.MemoryView\":679\n *     have_slices = False\n *     seen_ellipsis = False\n *     for idx, item in enumerate(tup):             # <<<<<<<<<<<<<<\n *         if item is Ellipsis:\n *             if not seen_ellipsis:\n */\n  }\n  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n  /* \"View.MemoryView\":694\n *             result.append(item)\n * \n *     nslices = ndim - len(result)             # <<<<<<<<<<<<<<\n *     if nslices:\n *         result.extend([slice(None)] * nslices)\n */\n  __pyx_t_5 = PyList_GET_SIZE(__pyx_v_result); if (unlikely(__pyx_t_5 == ((Py_ssize_t)-1))) __PYX_ERR(2, 694, __pyx_L1_error)\n  __pyx_v_nslices = (__pyx_v_ndim - __pyx_t_5);\n\n  /* 
\"View.MemoryView\":695\n * \n *     nslices = ndim - len(result)\n *     if nslices:             # <<<<<<<<<<<<<<\n *         result.extend([slice(None)] * nslices)\n * \n */\n  __pyx_t_1 = (__pyx_v_nslices != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":696\n *     nslices = ndim - len(result)\n *     if nslices:\n *         result.extend([slice(None)] * nslices)             # <<<<<<<<<<<<<<\n * \n *     return have_slices or nslices, tuple(result)\n */\n    __pyx_t_3 = PyList_New(1 * ((__pyx_v_nslices<0) ? 0:__pyx_v_nslices)); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 696, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    { Py_ssize_t __pyx_temp;\n      for (__pyx_temp=0; __pyx_temp < __pyx_v_nslices; __pyx_temp++) {\n        __Pyx_INCREF(__pyx_slice__18);\n        __Pyx_GIVEREF(__pyx_slice__18);\n        PyList_SET_ITEM(__pyx_t_3, __pyx_temp, __pyx_slice__18);\n      }\n    }\n    __pyx_t_9 = __Pyx_PyList_Extend(__pyx_v_result, __pyx_t_3); if (unlikely(__pyx_t_9 == ((int)-1))) __PYX_ERR(2, 696, __pyx_L1_error)\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n    /* \"View.MemoryView\":695\n * \n *     nslices = ndim - len(result)\n *     if nslices:             # <<<<<<<<<<<<<<\n *         result.extend([slice(None)] * nslices)\n * \n */\n  }\n\n  /* \"View.MemoryView\":698\n *         result.extend([slice(None)] * nslices)\n * \n *     return have_slices or nslices, tuple(result)             # <<<<<<<<<<<<<<\n * \n * cdef assert_direct_dimensions(Py_ssize_t *suboffsets, int ndim):\n */\n  __Pyx_XDECREF(__pyx_r);\n  if (!__pyx_v_have_slices) {\n  } else {\n    __pyx_t_4 = __Pyx_PyBool_FromLong(__pyx_v_have_slices); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 698, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __pyx_t_3 = __pyx_t_4;\n    __pyx_t_4 = 0;\n    goto __pyx_L14_bool_binop_done;\n  }\n  __pyx_t_4 = PyInt_FromSsize_t(__pyx_v_nslices); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 698, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_4);\n  __pyx_t_3 = __pyx_t_4;\n 
 __pyx_t_4 = 0;\n  __pyx_L14_bool_binop_done:;\n  __pyx_t_4 = PyList_AsTuple(__pyx_v_result); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 698, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_4);\n  __pyx_t_11 = PyTuple_New(2); if (unlikely(!__pyx_t_11)) __PYX_ERR(2, 698, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_11);\n  __Pyx_GIVEREF(__pyx_t_3);\n  PyTuple_SET_ITEM(__pyx_t_11, 0, __pyx_t_3);\n  __Pyx_GIVEREF(__pyx_t_4);\n  PyTuple_SET_ITEM(__pyx_t_11, 1, __pyx_t_4);\n  __pyx_t_3 = 0;\n  __pyx_t_4 = 0;\n  __pyx_r = ((PyObject*)__pyx_t_11);\n  __pyx_t_11 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":666\n *     return isinstance(o, memoryview)\n * \n * cdef tuple _unellipsify(object index, int ndim):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Replace all ellipses with full slices and fill incomplete indices with\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_7);\n  __Pyx_XDECREF(__pyx_t_11);\n  __Pyx_AddTraceback(\"View.MemoryView._unellipsify\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v_tup);\n  __Pyx_XDECREF(__pyx_v_result);\n  __Pyx_XDECREF(__pyx_v_idx);\n  __Pyx_XDECREF(__pyx_v_item);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":700\n *     return have_slices or nslices, tuple(result)\n * \n * cdef assert_direct_dimensions(Py_ssize_t *suboffsets, int ndim):             # <<<<<<<<<<<<<<\n *     for suboffset in suboffsets[:ndim]:\n *         if suboffset >= 0:\n */\n\nstatic PyObject *assert_direct_dimensions(Py_ssize_t *__pyx_v_suboffsets, int __pyx_v_ndim) {\n  Py_ssize_t __pyx_v_suboffset;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  Py_ssize_t *__pyx_t_1;\n  Py_ssize_t *__pyx_t_2;\n  Py_ssize_t *__pyx_t_3;\n  int __pyx_t_4;\n  PyObject *__pyx_t_5 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int 
__pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"assert_direct_dimensions\", 0);\n\n  /* \"View.MemoryView\":701\n * \n * cdef assert_direct_dimensions(Py_ssize_t *suboffsets, int ndim):\n *     for suboffset in suboffsets[:ndim]:             # <<<<<<<<<<<<<<\n *         if suboffset >= 0:\n *             raise ValueError(\"Indirect dimensions not supported\")\n */\n  __pyx_t_2 = (__pyx_v_suboffsets + __pyx_v_ndim);\n  for (__pyx_t_3 = __pyx_v_suboffsets; __pyx_t_3 < __pyx_t_2; __pyx_t_3++) {\n    __pyx_t_1 = __pyx_t_3;\n    __pyx_v_suboffset = (__pyx_t_1[0]);\n\n    /* \"View.MemoryView\":702\n * cdef assert_direct_dimensions(Py_ssize_t *suboffsets, int ndim):\n *     for suboffset in suboffsets[:ndim]:\n *         if suboffset >= 0:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Indirect dimensions not supported\")\n * \n */\n    __pyx_t_4 = ((__pyx_v_suboffset >= 0) != 0);\n    if (unlikely(__pyx_t_4)) {\n\n      /* \"View.MemoryView\":703\n *     for suboffset in suboffsets[:ndim]:\n *         if suboffset >= 0:\n *             raise ValueError(\"Indirect dimensions not supported\")             # <<<<<<<<<<<<<<\n * \n * \n */\n      __pyx_t_5 = __Pyx_PyObject_Call(__pyx_builtin_ValueError, __pyx_tuple__19, NULL); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 703, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_5);\n      __Pyx_Raise(__pyx_t_5, 0, 0, 0);\n      __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;\n      __PYX_ERR(2, 703, __pyx_L1_error)\n\n      /* \"View.MemoryView\":702\n * cdef assert_direct_dimensions(Py_ssize_t *suboffsets, int ndim):\n *     for suboffset in suboffsets[:ndim]:\n *         if suboffset >= 0:             # <<<<<<<<<<<<<<\n *             raise ValueError(\"Indirect dimensions not supported\")\n * \n */\n    }\n  }\n\n  /* \"View.MemoryView\":700\n *     return have_slices or nslices, tuple(result)\n * \n * cdef assert_direct_dimensions(Py_ssize_t *suboffsets, int ndim):             # <<<<<<<<<<<<<<\n *     for suboffset in 
suboffsets[:ndim]:\n *         if suboffset >= 0:\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView.assert_direct_dimensions\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":710\n * \n * @cname('__pyx_memview_slice')\n * cdef memoryview memview_slice(memoryview memview, object indices):             # <<<<<<<<<<<<<<\n *     cdef int new_ndim = 0, suboffset_dim = -1, dim\n *     cdef bint negative_step\n */\n\nstatic struct __pyx_memoryview_obj *__pyx_memview_slice(struct __pyx_memoryview_obj *__pyx_v_memview, PyObject *__pyx_v_indices) {\n  int __pyx_v_new_ndim;\n  int __pyx_v_suboffset_dim;\n  int __pyx_v_dim;\n  __Pyx_memviewslice __pyx_v_src;\n  __Pyx_memviewslice __pyx_v_dst;\n  __Pyx_memviewslice *__pyx_v_p_src;\n  struct __pyx_memoryviewslice_obj *__pyx_v_memviewsliceobj = 0;\n  __Pyx_memviewslice *__pyx_v_p_dst;\n  int *__pyx_v_p_suboffset_dim;\n  Py_ssize_t __pyx_v_start;\n  Py_ssize_t __pyx_v_stop;\n  Py_ssize_t __pyx_v_step;\n  int __pyx_v_have_start;\n  int __pyx_v_have_stop;\n  int __pyx_v_have_step;\n  PyObject *__pyx_v_index = NULL;\n  struct __pyx_memoryview_obj *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  struct __pyx_memoryview_obj *__pyx_t_4;\n  char *__pyx_t_5;\n  int __pyx_t_6;\n  Py_ssize_t __pyx_t_7;\n  PyObject *(*__pyx_t_8)(PyObject *);\n  PyObject *__pyx_t_9 = NULL;\n  Py_ssize_t __pyx_t_10;\n  int __pyx_t_11;\n  Py_ssize_t __pyx_t_12;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"memview_slice\", 0);\n\n  /* \"View.MemoryView\":711\n * @cname('__pyx_memview_slice')\n * cdef memoryview memview_slice(memoryview memview, object 
indices):\n *     cdef int new_ndim = 0, suboffset_dim = -1, dim             # <<<<<<<<<<<<<<\n *     cdef bint negative_step\n *     cdef __Pyx_memviewslice src, dst\n */\n  __pyx_v_new_ndim = 0;\n  __pyx_v_suboffset_dim = -1;\n\n  /* \"View.MemoryView\":718\n * \n * \n *     memset(&dst, 0, sizeof(dst))             # <<<<<<<<<<<<<<\n * \n *     cdef _memoryviewslice memviewsliceobj\n */\n  (void)(memset((&__pyx_v_dst), 0, (sizeof(__pyx_v_dst))));\n\n  /* \"View.MemoryView\":722\n *     cdef _memoryviewslice memviewsliceobj\n * \n *     assert memview.view.ndim > 0             # <<<<<<<<<<<<<<\n * \n *     if isinstance(memview, _memoryviewslice):\n */\n  #ifndef CYTHON_WITHOUT_ASSERTIONS\n  if (unlikely(!Py_OptimizeFlag)) {\n    if (unlikely(!((__pyx_v_memview->view.ndim > 0) != 0))) {\n      PyErr_SetNone(PyExc_AssertionError);\n      __PYX_ERR(2, 722, __pyx_L1_error)\n    }\n  }\n  #endif\n\n  /* \"View.MemoryView\":724\n *     assert memview.view.ndim > 0\n * \n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         memviewsliceobj = memview\n *         p_src = &memviewsliceobj.from_slice\n */\n  __pyx_t_1 = __Pyx_TypeCheck(((PyObject *)__pyx_v_memview), __pyx_memoryviewslice_type); \n  __pyx_t_2 = (__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":725\n * \n *     if isinstance(memview, _memoryviewslice):\n *         memviewsliceobj = memview             # <<<<<<<<<<<<<<\n *         p_src = &memviewsliceobj.from_slice\n *     else:\n */\n    if (!(likely(((((PyObject *)__pyx_v_memview)) == Py_None) || likely(__Pyx_TypeTest(((PyObject *)__pyx_v_memview), __pyx_memoryviewslice_type))))) __PYX_ERR(2, 725, __pyx_L1_error)\n    __pyx_t_3 = ((PyObject *)__pyx_v_memview);\n    __Pyx_INCREF(__pyx_t_3);\n    __pyx_v_memviewsliceobj = ((struct __pyx_memoryviewslice_obj *)__pyx_t_3);\n    __pyx_t_3 = 0;\n\n    /* \"View.MemoryView\":726\n *     if isinstance(memview, _memoryviewslice):\n *         memviewsliceobj = 
memview\n *         p_src = &memviewsliceobj.from_slice             # <<<<<<<<<<<<<<\n *     else:\n *         slice_copy(memview, &src)\n */\n    __pyx_v_p_src = (&__pyx_v_memviewsliceobj->from_slice);\n\n    /* \"View.MemoryView\":724\n *     assert memview.view.ndim > 0\n * \n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         memviewsliceobj = memview\n *         p_src = &memviewsliceobj.from_slice\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":728\n *         p_src = &memviewsliceobj.from_slice\n *     else:\n *         slice_copy(memview, &src)             # <<<<<<<<<<<<<<\n *         p_src = &src\n * \n */\n  /*else*/ {\n    __pyx_memoryview_slice_copy(__pyx_v_memview, (&__pyx_v_src));\n\n    /* \"View.MemoryView\":729\n *     else:\n *         slice_copy(memview, &src)\n *         p_src = &src             # <<<<<<<<<<<<<<\n * \n * \n */\n    __pyx_v_p_src = (&__pyx_v_src);\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":735\n * \n * \n *     dst.memview = p_src.memview             # <<<<<<<<<<<<<<\n *     dst.data = p_src.data\n * \n */\n  __pyx_t_4 = __pyx_v_p_src->memview;\n  __pyx_v_dst.memview = __pyx_t_4;\n\n  /* \"View.MemoryView\":736\n * \n *     dst.memview = p_src.memview\n *     dst.data = p_src.data             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_5 = __pyx_v_p_src->data;\n  __pyx_v_dst.data = __pyx_t_5;\n\n  /* \"View.MemoryView\":741\n * \n * \n *     cdef __Pyx_memviewslice *p_dst = &dst             # <<<<<<<<<<<<<<\n *     cdef int *p_suboffset_dim = &suboffset_dim\n *     cdef Py_ssize_t start, stop, step\n */\n  __pyx_v_p_dst = (&__pyx_v_dst);\n\n  /* \"View.MemoryView\":742\n * \n *     cdef __Pyx_memviewslice *p_dst = &dst\n *     cdef int *p_suboffset_dim = &suboffset_dim             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t start, stop, step\n *     cdef bint have_start, have_stop, have_step\n */\n  __pyx_v_p_suboffset_dim = (&__pyx_v_suboffset_dim);\n\n  /* \"View.MemoryView\":746\n 
*     cdef bint have_start, have_stop, have_step\n * \n *     for dim, index in enumerate(indices):             # <<<<<<<<<<<<<<\n *         if PyIndex_Check(index):\n *             slice_memviewslice(\n */\n  __pyx_t_6 = 0;\n  if (likely(PyList_CheckExact(__pyx_v_indices)) || PyTuple_CheckExact(__pyx_v_indices)) {\n    __pyx_t_3 = __pyx_v_indices; __Pyx_INCREF(__pyx_t_3); __pyx_t_7 = 0;\n    __pyx_t_8 = NULL;\n  } else {\n    __pyx_t_7 = -1; __pyx_t_3 = PyObject_GetIter(__pyx_v_indices); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 746, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_t_8 = Py_TYPE(__pyx_t_3)->tp_iternext; if (unlikely(!__pyx_t_8)) __PYX_ERR(2, 746, __pyx_L1_error)\n  }\n  for (;;) {\n    if (likely(!__pyx_t_8)) {\n      if (likely(PyList_CheckExact(__pyx_t_3))) {\n        if (__pyx_t_7 >= PyList_GET_SIZE(__pyx_t_3)) break;\n        #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n        __pyx_t_9 = PyList_GET_ITEM(__pyx_t_3, __pyx_t_7); __Pyx_INCREF(__pyx_t_9); __pyx_t_7++; if (unlikely(0 < 0)) __PYX_ERR(2, 746, __pyx_L1_error)\n        #else\n        __pyx_t_9 = PySequence_ITEM(__pyx_t_3, __pyx_t_7); __pyx_t_7++; if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 746, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_9);\n        #endif\n      } else {\n        if (__pyx_t_7 >= PyTuple_GET_SIZE(__pyx_t_3)) break;\n        #if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n        __pyx_t_9 = PyTuple_GET_ITEM(__pyx_t_3, __pyx_t_7); __Pyx_INCREF(__pyx_t_9); __pyx_t_7++; if (unlikely(0 < 0)) __PYX_ERR(2, 746, __pyx_L1_error)\n        #else\n        __pyx_t_9 = PySequence_ITEM(__pyx_t_3, __pyx_t_7); __pyx_t_7++; if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 746, __pyx_L1_error)\n        __Pyx_GOTREF(__pyx_t_9);\n        #endif\n      }\n    } else {\n      __pyx_t_9 = __pyx_t_8(__pyx_t_3);\n      if (unlikely(!__pyx_t_9)) {\n        PyObject* exc_type = PyErr_Occurred();\n        if (exc_type) {\n          if 
(likely(__Pyx_PyErr_GivenExceptionMatches(exc_type, PyExc_StopIteration))) PyErr_Clear();\n          else __PYX_ERR(2, 746, __pyx_L1_error)\n        }\n        break;\n      }\n      __Pyx_GOTREF(__pyx_t_9);\n    }\n    __Pyx_XDECREF_SET(__pyx_v_index, __pyx_t_9);\n    __pyx_t_9 = 0;\n    __pyx_v_dim = __pyx_t_6;\n    __pyx_t_6 = (__pyx_t_6 + 1);\n\n    /* \"View.MemoryView\":747\n * \n *     for dim, index in enumerate(indices):\n *         if PyIndex_Check(index):             # <<<<<<<<<<<<<<\n *             slice_memviewslice(\n *                 p_dst, p_src.shape[dim], p_src.strides[dim], p_src.suboffsets[dim],\n */\n    __pyx_t_2 = (PyIndex_Check(__pyx_v_index) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":751\n *                 p_dst, p_src.shape[dim], p_src.strides[dim], p_src.suboffsets[dim],\n *                 dim, new_ndim, p_suboffset_dim,\n *                 index, 0, 0, # start, stop, step             # <<<<<<<<<<<<<<\n *                 0, 0, 0, # have_{start,stop,step}\n *                 False)\n */\n      __pyx_t_10 = __Pyx_PyIndex_AsSsize_t(__pyx_v_index); if (unlikely((__pyx_t_10 == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 751, __pyx_L1_error)\n\n      /* \"View.MemoryView\":748\n *     for dim, index in enumerate(indices):\n *         if PyIndex_Check(index):\n *             slice_memviewslice(             # <<<<<<<<<<<<<<\n *                 p_dst, p_src.shape[dim], p_src.strides[dim], p_src.suboffsets[dim],\n *                 dim, new_ndim, p_suboffset_dim,\n */\n      __pyx_t_11 = __pyx_memoryview_slice_memviewslice(__pyx_v_p_dst, (__pyx_v_p_src->shape[__pyx_v_dim]), (__pyx_v_p_src->strides[__pyx_v_dim]), (__pyx_v_p_src->suboffsets[__pyx_v_dim]), __pyx_v_dim, __pyx_v_new_ndim, __pyx_v_p_suboffset_dim, __pyx_t_10, 0, 0, 0, 0, 0, 0); if (unlikely(__pyx_t_11 == ((int)-1))) __PYX_ERR(2, 748, __pyx_L1_error)\n\n      /* \"View.MemoryView\":747\n * \n *     for dim, index in enumerate(indices):\n *         if 
PyIndex_Check(index):             # <<<<<<<<<<<<<<\n *             slice_memviewslice(\n *                 p_dst, p_src.shape[dim], p_src.strides[dim], p_src.suboffsets[dim],\n */\n      goto __pyx_L6;\n    }\n\n    /* \"View.MemoryView\":754\n *                 0, 0, 0, # have_{start,stop,step}\n *                 False)\n *         elif index is None:             # <<<<<<<<<<<<<<\n *             p_dst.shape[new_ndim] = 1\n *             p_dst.strides[new_ndim] = 0\n */\n    __pyx_t_2 = (__pyx_v_index == Py_None);\n    __pyx_t_1 = (__pyx_t_2 != 0);\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":755\n *                 False)\n *         elif index is None:\n *             p_dst.shape[new_ndim] = 1             # <<<<<<<<<<<<<<\n *             p_dst.strides[new_ndim] = 0\n *             p_dst.suboffsets[new_ndim] = -1\n */\n      (__pyx_v_p_dst->shape[__pyx_v_new_ndim]) = 1;\n\n      /* \"View.MemoryView\":756\n *         elif index is None:\n *             p_dst.shape[new_ndim] = 1\n *             p_dst.strides[new_ndim] = 0             # <<<<<<<<<<<<<<\n *             p_dst.suboffsets[new_ndim] = -1\n *             new_ndim += 1\n */\n      (__pyx_v_p_dst->strides[__pyx_v_new_ndim]) = 0;\n\n      /* \"View.MemoryView\":757\n *             p_dst.shape[new_ndim] = 1\n *             p_dst.strides[new_ndim] = 0\n *             p_dst.suboffsets[new_ndim] = -1             # <<<<<<<<<<<<<<\n *             new_ndim += 1\n *         else:\n */\n      (__pyx_v_p_dst->suboffsets[__pyx_v_new_ndim]) = -1L;\n\n      /* \"View.MemoryView\":758\n *             p_dst.strides[new_ndim] = 0\n *             p_dst.suboffsets[new_ndim] = -1\n *             new_ndim += 1             # <<<<<<<<<<<<<<\n *         else:\n *             start = index.start or 0\n */\n      __pyx_v_new_ndim = (__pyx_v_new_ndim + 1);\n\n      /* \"View.MemoryView\":754\n *                 0, 0, 0, # have_{start,stop,step}\n *                 False)\n *         elif index is None:             # 
<<<<<<<<<<<<<<\n *             p_dst.shape[new_ndim] = 1\n *             p_dst.strides[new_ndim] = 0\n */\n      goto __pyx_L6;\n    }\n\n    /* \"View.MemoryView\":760\n *             new_ndim += 1\n *         else:\n *             start = index.start or 0             # <<<<<<<<<<<<<<\n *             stop = index.stop or 0\n *             step = index.step or 0\n */\n    /*else*/ {\n      __pyx_t_9 = __Pyx_PyObject_GetAttrStr(__pyx_v_index, __pyx_n_s_start); if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 760, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_9);\n      __pyx_t_1 = __Pyx_PyObject_IsTrue(__pyx_t_9); if (unlikely(__pyx_t_1 < 0)) __PYX_ERR(2, 760, __pyx_L1_error)\n      if (!__pyx_t_1) {\n        __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n      } else {\n        __pyx_t_12 = __Pyx_PyIndex_AsSsize_t(__pyx_t_9); if (unlikely((__pyx_t_12 == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 760, __pyx_L1_error)\n        __pyx_t_10 = __pyx_t_12;\n        __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n        goto __pyx_L7_bool_binop_done;\n      }\n      __pyx_t_10 = 0;\n      __pyx_L7_bool_binop_done:;\n      __pyx_v_start = __pyx_t_10;\n\n      /* \"View.MemoryView\":761\n *         else:\n *             start = index.start or 0\n *             stop = index.stop or 0             # <<<<<<<<<<<<<<\n *             step = index.step or 0\n * \n */\n      __pyx_t_9 = __Pyx_PyObject_GetAttrStr(__pyx_v_index, __pyx_n_s_stop); if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 761, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_9);\n      __pyx_t_1 = __Pyx_PyObject_IsTrue(__pyx_t_9); if (unlikely(__pyx_t_1 < 0)) __PYX_ERR(2, 761, __pyx_L1_error)\n      if (!__pyx_t_1) {\n        __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n      } else {\n        __pyx_t_12 = __Pyx_PyIndex_AsSsize_t(__pyx_t_9); if (unlikely((__pyx_t_12 == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 761, __pyx_L1_error)\n        __pyx_t_10 = __pyx_t_12;\n        __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n        goto 
__pyx_L9_bool_binop_done;\n      }\n      __pyx_t_10 = 0;\n      __pyx_L9_bool_binop_done:;\n      __pyx_v_stop = __pyx_t_10;\n\n      /* \"View.MemoryView\":762\n *             start = index.start or 0\n *             stop = index.stop or 0\n *             step = index.step or 0             # <<<<<<<<<<<<<<\n * \n *             have_start = index.start is not None\n */\n      __pyx_t_9 = __Pyx_PyObject_GetAttrStr(__pyx_v_index, __pyx_n_s_step); if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 762, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_9);\n      __pyx_t_1 = __Pyx_PyObject_IsTrue(__pyx_t_9); if (unlikely(__pyx_t_1 < 0)) __PYX_ERR(2, 762, __pyx_L1_error)\n      if (!__pyx_t_1) {\n        __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n      } else {\n        __pyx_t_12 = __Pyx_PyIndex_AsSsize_t(__pyx_t_9); if (unlikely((__pyx_t_12 == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 762, __pyx_L1_error)\n        __pyx_t_10 = __pyx_t_12;\n        __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n        goto __pyx_L11_bool_binop_done;\n      }\n      __pyx_t_10 = 0;\n      __pyx_L11_bool_binop_done:;\n      __pyx_v_step = __pyx_t_10;\n\n      /* \"View.MemoryView\":764\n *             step = index.step or 0\n * \n *             have_start = index.start is not None             # <<<<<<<<<<<<<<\n *             have_stop = index.stop is not None\n *             have_step = index.step is not None\n */\n      __pyx_t_9 = __Pyx_PyObject_GetAttrStr(__pyx_v_index, __pyx_n_s_start); if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 764, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_9);\n      __pyx_t_1 = (__pyx_t_9 != Py_None);\n      __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n      __pyx_v_have_start = __pyx_t_1;\n\n      /* \"View.MemoryView\":765\n * \n *             have_start = index.start is not None\n *             have_stop = index.stop is not None             # <<<<<<<<<<<<<<\n *             have_step = index.step is not None\n * \n */\n      __pyx_t_9 = __Pyx_PyObject_GetAttrStr(__pyx_v_index, 
__pyx_n_s_stop); if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 765, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_9);\n      __pyx_t_1 = (__pyx_t_9 != Py_None);\n      __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n      __pyx_v_have_stop = __pyx_t_1;\n\n      /* \"View.MemoryView\":766\n *             have_start = index.start is not None\n *             have_stop = index.stop is not None\n *             have_step = index.step is not None             # <<<<<<<<<<<<<<\n * \n *             slice_memviewslice(\n */\n      __pyx_t_9 = __Pyx_PyObject_GetAttrStr(__pyx_v_index, __pyx_n_s_step); if (unlikely(!__pyx_t_9)) __PYX_ERR(2, 766, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_9);\n      __pyx_t_1 = (__pyx_t_9 != Py_None);\n      __Pyx_DECREF(__pyx_t_9); __pyx_t_9 = 0;\n      __pyx_v_have_step = __pyx_t_1;\n\n      /* \"View.MemoryView\":768\n *             have_step = index.step is not None\n * \n *             slice_memviewslice(             # <<<<<<<<<<<<<<\n *                 p_dst, p_src.shape[dim], p_src.strides[dim], p_src.suboffsets[dim],\n *                 dim, new_ndim, p_suboffset_dim,\n */\n      __pyx_t_11 = __pyx_memoryview_slice_memviewslice(__pyx_v_p_dst, (__pyx_v_p_src->shape[__pyx_v_dim]), (__pyx_v_p_src->strides[__pyx_v_dim]), (__pyx_v_p_src->suboffsets[__pyx_v_dim]), __pyx_v_dim, __pyx_v_new_ndim, __pyx_v_p_suboffset_dim, __pyx_v_start, __pyx_v_stop, __pyx_v_step, __pyx_v_have_start, __pyx_v_have_stop, __pyx_v_have_step, 1); if (unlikely(__pyx_t_11 == ((int)-1))) __PYX_ERR(2, 768, __pyx_L1_error)\n\n      /* \"View.MemoryView\":774\n *                 have_start, have_stop, have_step,\n *                 True)\n *             new_ndim += 1             # <<<<<<<<<<<<<<\n * \n *     if isinstance(memview, _memoryviewslice):\n */\n      __pyx_v_new_ndim = (__pyx_v_new_ndim + 1);\n    }\n    __pyx_L6:;\n\n    /* \"View.MemoryView\":746\n *     cdef bint have_start, have_stop, have_step\n * \n *     for dim, index in enumerate(indices):             # <<<<<<<<<<<<<<\n 
*         if PyIndex_Check(index):\n *             slice_memviewslice(\n */\n  }\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n  /* \"View.MemoryView\":776\n *             new_ndim += 1\n * \n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         return memoryview_fromslice(dst, new_ndim,\n *                                     memviewsliceobj.to_object_func,\n */\n  __pyx_t_1 = __Pyx_TypeCheck(((PyObject *)__pyx_v_memview), __pyx_memoryviewslice_type); \n  __pyx_t_2 = (__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":777\n * \n *     if isinstance(memview, _memoryviewslice):\n *         return memoryview_fromslice(dst, new_ndim,             # <<<<<<<<<<<<<<\n *                                     memviewsliceobj.to_object_func,\n *                                     memviewsliceobj.to_dtype_func,\n */\n    __Pyx_XDECREF(((PyObject *)__pyx_r));\n\n    /* \"View.MemoryView\":778\n *     if isinstance(memview, _memoryviewslice):\n *         return memoryview_fromslice(dst, new_ndim,\n *                                     memviewsliceobj.to_object_func,             # <<<<<<<<<<<<<<\n *                                     memviewsliceobj.to_dtype_func,\n *                                     memview.dtype_is_object)\n */\n    if (unlikely(!__pyx_v_memviewsliceobj)) { __Pyx_RaiseUnboundLocalError(\"memviewsliceobj\"); __PYX_ERR(2, 778, __pyx_L1_error) }\n\n    /* \"View.MemoryView\":779\n *         return memoryview_fromslice(dst, new_ndim,\n *                                     memviewsliceobj.to_object_func,\n *                                     memviewsliceobj.to_dtype_func,             # <<<<<<<<<<<<<<\n *                                     memview.dtype_is_object)\n *     else:\n */\n    if (unlikely(!__pyx_v_memviewsliceobj)) { __Pyx_RaiseUnboundLocalError(\"memviewsliceobj\"); __PYX_ERR(2, 779, __pyx_L1_error) }\n\n    /* \"View.MemoryView\":777\n * \n *     if isinstance(memview, _memoryviewslice):\n *   
      return memoryview_fromslice(dst, new_ndim,             # <<<<<<<<<<<<<<\n *                                     memviewsliceobj.to_object_func,\n *                                     memviewsliceobj.to_dtype_func,\n */\n    __pyx_t_3 = __pyx_memoryview_fromslice(__pyx_v_dst, __pyx_v_new_ndim, __pyx_v_memviewsliceobj->to_object_func, __pyx_v_memviewsliceobj->to_dtype_func, __pyx_v_memview->dtype_is_object); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 777, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    if (!(likely(((__pyx_t_3) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_3, __pyx_memoryview_type))))) __PYX_ERR(2, 777, __pyx_L1_error)\n    __pyx_r = ((struct __pyx_memoryview_obj *)__pyx_t_3);\n    __pyx_t_3 = 0;\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":776\n *             new_ndim += 1\n * \n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         return memoryview_fromslice(dst, new_ndim,\n *                                     memviewsliceobj.to_object_func,\n */\n  }\n\n  /* \"View.MemoryView\":782\n *                                     memview.dtype_is_object)\n *     else:\n *         return memoryview_fromslice(dst, new_ndim, NULL, NULL,             # <<<<<<<<<<<<<<\n *                                     memview.dtype_is_object)\n * \n */\n  /*else*/ {\n    __Pyx_XDECREF(((PyObject *)__pyx_r));\n\n    /* \"View.MemoryView\":783\n *     else:\n *         return memoryview_fromslice(dst, new_ndim, NULL, NULL,\n *                                     memview.dtype_is_object)             # <<<<<<<<<<<<<<\n * \n * \n */\n    __pyx_t_3 = __pyx_memoryview_fromslice(__pyx_v_dst, __pyx_v_new_ndim, NULL, NULL, __pyx_v_memview->dtype_is_object); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 782, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n\n    /* \"View.MemoryView\":782\n *                                     memview.dtype_is_object)\n *     else:\n *         return memoryview_fromslice(dst, new_ndim, NULL, NULL,             # 
<<<<<<<<<<<<<<\n *                                     memview.dtype_is_object)\n * \n */\n    if (!(likely(((__pyx_t_3) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_3, __pyx_memoryview_type))))) __PYX_ERR(2, 782, __pyx_L1_error)\n    __pyx_r = ((struct __pyx_memoryview_obj *)__pyx_t_3);\n    __pyx_t_3 = 0;\n    goto __pyx_L0;\n  }\n\n  /* \"View.MemoryView\":710\n * \n * @cname('__pyx_memview_slice')\n * cdef memoryview memview_slice(memoryview memview, object indices):             # <<<<<<<<<<<<<<\n *     cdef int new_ndim = 0, suboffset_dim = -1, dim\n *     cdef bint negative_step\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_9);\n  __Pyx_AddTraceback(\"View.MemoryView.memview_slice\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XDECREF((PyObject *)__pyx_v_memviewsliceobj);\n  __Pyx_XDECREF(__pyx_v_index);\n  __Pyx_XGIVEREF((PyObject *)__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":807\n * \n * @cname('__pyx_memoryview_slice_memviewslice')\n * cdef int slice_memviewslice(             # <<<<<<<<<<<<<<\n *         __Pyx_memviewslice *dst,\n *         Py_ssize_t shape, Py_ssize_t stride, Py_ssize_t suboffset,\n */\n\nstatic int __pyx_memoryview_slice_memviewslice(__Pyx_memviewslice *__pyx_v_dst, Py_ssize_t __pyx_v_shape, Py_ssize_t __pyx_v_stride, Py_ssize_t __pyx_v_suboffset, int __pyx_v_dim, int __pyx_v_new_ndim, int *__pyx_v_suboffset_dim, Py_ssize_t __pyx_v_start, Py_ssize_t __pyx_v_stop, Py_ssize_t __pyx_v_step, int __pyx_v_have_start, int __pyx_v_have_stop, int __pyx_v_have_step, int __pyx_v_is_slice) {\n  Py_ssize_t __pyx_v_new_shape;\n  int __pyx_v_negative_step;\n  int __pyx_r;\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n\n  /* \"View.MemoryView\":827\n *     cdef bint negative_step\n * \n *     if not 
is_slice:             # <<<<<<<<<<<<<<\n * \n *         if start < 0:\n */\n  __pyx_t_1 = ((!(__pyx_v_is_slice != 0)) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":829\n *     if not is_slice:\n * \n *         if start < 0:             # <<<<<<<<<<<<<<\n *             start += shape\n *         if not 0 <= start < shape:\n */\n    __pyx_t_1 = ((__pyx_v_start < 0) != 0);\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":830\n * \n *         if start < 0:\n *             start += shape             # <<<<<<<<<<<<<<\n *         if not 0 <= start < shape:\n *             _err_dim(IndexError, \"Index out of bounds (axis %d)\", dim)\n */\n      __pyx_v_start = (__pyx_v_start + __pyx_v_shape);\n\n      /* \"View.MemoryView\":829\n *     if not is_slice:\n * \n *         if start < 0:             # <<<<<<<<<<<<<<\n *             start += shape\n *         if not 0 <= start < shape:\n */\n    }\n\n    /* \"View.MemoryView\":831\n *         if start < 0:\n *             start += shape\n *         if not 0 <= start < shape:             # <<<<<<<<<<<<<<\n *             _err_dim(IndexError, \"Index out of bounds (axis %d)\", dim)\n *     else:\n */\n    __pyx_t_1 = (0 <= __pyx_v_start);\n    if (__pyx_t_1) {\n      __pyx_t_1 = (__pyx_v_start < __pyx_v_shape);\n    }\n    __pyx_t_2 = ((!(__pyx_t_1 != 0)) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":832\n *             start += shape\n *         if not 0 <= start < shape:\n *             _err_dim(IndexError, \"Index out of bounds (axis %d)\", dim)             # <<<<<<<<<<<<<<\n *     else:\n * \n */\n      __pyx_t_3 = __pyx_memoryview_err_dim(__pyx_builtin_IndexError, ((char *)\"Index out of bounds (axis %d)\"), __pyx_v_dim); if (unlikely(__pyx_t_3 == ((int)-1))) __PYX_ERR(2, 832, __pyx_L1_error)\n\n      /* \"View.MemoryView\":831\n *         if start < 0:\n *             start += shape\n *         if not 0 <= start < shape:             # <<<<<<<<<<<<<<\n *             _err_dim(IndexError, \"Index 
out of bounds (axis %d)\", dim)\n *     else:\n */\n    }\n\n    /* \"View.MemoryView\":827\n *     cdef bint negative_step\n * \n *     if not is_slice:             # <<<<<<<<<<<<<<\n * \n *         if start < 0:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":835\n *     else:\n * \n *         negative_step = have_step != 0 and step < 0             # <<<<<<<<<<<<<<\n * \n *         if have_step and step == 0:\n */\n  /*else*/ {\n    __pyx_t_1 = ((__pyx_v_have_step != 0) != 0);\n    if (__pyx_t_1) {\n    } else {\n      __pyx_t_2 = __pyx_t_1;\n      goto __pyx_L6_bool_binop_done;\n    }\n    __pyx_t_1 = ((__pyx_v_step < 0) != 0);\n    __pyx_t_2 = __pyx_t_1;\n    __pyx_L6_bool_binop_done:;\n    __pyx_v_negative_step = __pyx_t_2;\n\n    /* \"View.MemoryView\":837\n *         negative_step = have_step != 0 and step < 0\n * \n *         if have_step and step == 0:             # <<<<<<<<<<<<<<\n *             _err_dim(ValueError, \"Step may not be zero (axis %d)\", dim)\n * \n */\n    __pyx_t_1 = (__pyx_v_have_step != 0);\n    if (__pyx_t_1) {\n    } else {\n      __pyx_t_2 = __pyx_t_1;\n      goto __pyx_L9_bool_binop_done;\n    }\n    __pyx_t_1 = ((__pyx_v_step == 0) != 0);\n    __pyx_t_2 = __pyx_t_1;\n    __pyx_L9_bool_binop_done:;\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":838\n * \n *         if have_step and step == 0:\n *             _err_dim(ValueError, \"Step may not be zero (axis %d)\", dim)             # <<<<<<<<<<<<<<\n * \n * \n */\n      __pyx_t_3 = __pyx_memoryview_err_dim(__pyx_builtin_ValueError, ((char *)\"Step may not be zero (axis %d)\"), __pyx_v_dim); if (unlikely(__pyx_t_3 == ((int)-1))) __PYX_ERR(2, 838, __pyx_L1_error)\n\n      /* \"View.MemoryView\":837\n *         negative_step = have_step != 0 and step < 0\n * \n *         if have_step and step == 0:             # <<<<<<<<<<<<<<\n *             _err_dim(ValueError, \"Step may not be zero (axis %d)\", dim)\n * \n */\n    }\n\n    /* \"View.MemoryView\":841\n * \n * \n *     
    if have_start:             # <<<<<<<<<<<<<<\n *             if start < 0:\n *                 start += shape\n */\n    __pyx_t_2 = (__pyx_v_have_start != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":842\n * \n *         if have_start:\n *             if start < 0:             # <<<<<<<<<<<<<<\n *                 start += shape\n *                 if start < 0:\n */\n      __pyx_t_2 = ((__pyx_v_start < 0) != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":843\n *         if have_start:\n *             if start < 0:\n *                 start += shape             # <<<<<<<<<<<<<<\n *                 if start < 0:\n *                     start = 0\n */\n        __pyx_v_start = (__pyx_v_start + __pyx_v_shape);\n\n        /* \"View.MemoryView\":844\n *             if start < 0:\n *                 start += shape\n *                 if start < 0:             # <<<<<<<<<<<<<<\n *                     start = 0\n *             elif start >= shape:\n */\n        __pyx_t_2 = ((__pyx_v_start < 0) != 0);\n        if (__pyx_t_2) {\n\n          /* \"View.MemoryView\":845\n *                 start += shape\n *                 if start < 0:\n *                     start = 0             # <<<<<<<<<<<<<<\n *             elif start >= shape:\n *                 if negative_step:\n */\n          __pyx_v_start = 0;\n\n          /* \"View.MemoryView\":844\n *             if start < 0:\n *                 start += shape\n *                 if start < 0:             # <<<<<<<<<<<<<<\n *                     start = 0\n *             elif start >= shape:\n */\n        }\n\n        /* \"View.MemoryView\":842\n * \n *         if have_start:\n *             if start < 0:             # <<<<<<<<<<<<<<\n *                 start += shape\n *                 if start < 0:\n */\n        goto __pyx_L12;\n      }\n\n      /* \"View.MemoryView\":846\n *                 if start < 0:\n *                     start = 0\n *             elif start >= shape:             # 
<<<<<<<<<<<<<<\n *                 if negative_step:\n *                     start = shape - 1\n */\n      __pyx_t_2 = ((__pyx_v_start >= __pyx_v_shape) != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":847\n *                     start = 0\n *             elif start >= shape:\n *                 if negative_step:             # <<<<<<<<<<<<<<\n *                     start = shape - 1\n *                 else:\n */\n        __pyx_t_2 = (__pyx_v_negative_step != 0);\n        if (__pyx_t_2) {\n\n          /* \"View.MemoryView\":848\n *             elif start >= shape:\n *                 if negative_step:\n *                     start = shape - 1             # <<<<<<<<<<<<<<\n *                 else:\n *                     start = shape\n */\n          __pyx_v_start = (__pyx_v_shape - 1);\n\n          /* \"View.MemoryView\":847\n *                     start = 0\n *             elif start >= shape:\n *                 if negative_step:             # <<<<<<<<<<<<<<\n *                     start = shape - 1\n *                 else:\n */\n          goto __pyx_L14;\n        }\n\n        /* \"View.MemoryView\":850\n *                     start = shape - 1\n *                 else:\n *                     start = shape             # <<<<<<<<<<<<<<\n *         else:\n *             if negative_step:\n */\n        /*else*/ {\n          __pyx_v_start = __pyx_v_shape;\n        }\n        __pyx_L14:;\n\n        /* \"View.MemoryView\":846\n *                 if start < 0:\n *                     start = 0\n *             elif start >= shape:             # <<<<<<<<<<<<<<\n *                 if negative_step:\n *                     start = shape - 1\n */\n      }\n      __pyx_L12:;\n\n      /* \"View.MemoryView\":841\n * \n * \n *         if have_start:             # <<<<<<<<<<<<<<\n *             if start < 0:\n *                 start += shape\n */\n      goto __pyx_L11;\n    }\n\n    /* \"View.MemoryView\":852\n *                     start = shape\n *         
else:\n *             if negative_step:             # <<<<<<<<<<<<<<\n *                 start = shape - 1\n *             else:\n */\n    /*else*/ {\n      __pyx_t_2 = (__pyx_v_negative_step != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":853\n *         else:\n *             if negative_step:\n *                 start = shape - 1             # <<<<<<<<<<<<<<\n *             else:\n *                 start = 0\n */\n        __pyx_v_start = (__pyx_v_shape - 1);\n\n        /* \"View.MemoryView\":852\n *                     start = shape\n *         else:\n *             if negative_step:             # <<<<<<<<<<<<<<\n *                 start = shape - 1\n *             else:\n */\n        goto __pyx_L15;\n      }\n\n      /* \"View.MemoryView\":855\n *                 start = shape - 1\n *             else:\n *                 start = 0             # <<<<<<<<<<<<<<\n * \n *         if have_stop:\n */\n      /*else*/ {\n        __pyx_v_start = 0;\n      }\n      __pyx_L15:;\n    }\n    __pyx_L11:;\n\n    /* \"View.MemoryView\":857\n *                 start = 0\n * \n *         if have_stop:             # <<<<<<<<<<<<<<\n *             if stop < 0:\n *                 stop += shape\n */\n    __pyx_t_2 = (__pyx_v_have_stop != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":858\n * \n *         if have_stop:\n *             if stop < 0:             # <<<<<<<<<<<<<<\n *                 stop += shape\n *                 if stop < 0:\n */\n      __pyx_t_2 = ((__pyx_v_stop < 0) != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":859\n *         if have_stop:\n *             if stop < 0:\n *                 stop += shape             # <<<<<<<<<<<<<<\n *                 if stop < 0:\n *                     stop = 0\n */\n        __pyx_v_stop = (__pyx_v_stop + __pyx_v_shape);\n\n        /* \"View.MemoryView\":860\n *             if stop < 0:\n *                 stop += shape\n *                 if stop < 0:             # <<<<<<<<<<<<<<\n 
*                     stop = 0\n *             elif stop > shape:\n */\n        __pyx_t_2 = ((__pyx_v_stop < 0) != 0);\n        if (__pyx_t_2) {\n\n          /* \"View.MemoryView\":861\n *                 stop += shape\n *                 if stop < 0:\n *                     stop = 0             # <<<<<<<<<<<<<<\n *             elif stop > shape:\n *                 stop = shape\n */\n          __pyx_v_stop = 0;\n\n          /* \"View.MemoryView\":860\n *             if stop < 0:\n *                 stop += shape\n *                 if stop < 0:             # <<<<<<<<<<<<<<\n *                     stop = 0\n *             elif stop > shape:\n */\n        }\n\n        /* \"View.MemoryView\":858\n * \n *         if have_stop:\n *             if stop < 0:             # <<<<<<<<<<<<<<\n *                 stop += shape\n *                 if stop < 0:\n */\n        goto __pyx_L17;\n      }\n\n      /* \"View.MemoryView\":862\n *                 if stop < 0:\n *                     stop = 0\n *             elif stop > shape:             # <<<<<<<<<<<<<<\n *                 stop = shape\n *         else:\n */\n      __pyx_t_2 = ((__pyx_v_stop > __pyx_v_shape) != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":863\n *                     stop = 0\n *             elif stop > shape:\n *                 stop = shape             # <<<<<<<<<<<<<<\n *         else:\n *             if negative_step:\n */\n        __pyx_v_stop = __pyx_v_shape;\n\n        /* \"View.MemoryView\":862\n *                 if stop < 0:\n *                     stop = 0\n *             elif stop > shape:             # <<<<<<<<<<<<<<\n *                 stop = shape\n *         else:\n */\n      }\n      __pyx_L17:;\n\n      /* \"View.MemoryView\":857\n *                 start = 0\n * \n *         if have_stop:             # <<<<<<<<<<<<<<\n *             if stop < 0:\n *                 stop += shape\n */\n      goto __pyx_L16;\n    }\n\n    /* \"View.MemoryView\":865\n *                 
stop = shape\n *         else:\n *             if negative_step:             # <<<<<<<<<<<<<<\n *                 stop = -1\n *             else:\n */\n    /*else*/ {\n      __pyx_t_2 = (__pyx_v_negative_step != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":866\n *         else:\n *             if negative_step:\n *                 stop = -1             # <<<<<<<<<<<<<<\n *             else:\n *                 stop = shape\n */\n        __pyx_v_stop = -1L;\n\n        /* \"View.MemoryView\":865\n *                 stop = shape\n *         else:\n *             if negative_step:             # <<<<<<<<<<<<<<\n *                 stop = -1\n *             else:\n */\n        goto __pyx_L19;\n      }\n\n      /* \"View.MemoryView\":868\n *                 stop = -1\n *             else:\n *                 stop = shape             # <<<<<<<<<<<<<<\n * \n *         if not have_step:\n */\n      /*else*/ {\n        __pyx_v_stop = __pyx_v_shape;\n      }\n      __pyx_L19:;\n    }\n    __pyx_L16:;\n\n    /* \"View.MemoryView\":870\n *                 stop = shape\n * \n *         if not have_step:             # <<<<<<<<<<<<<<\n *             step = 1\n * \n */\n    __pyx_t_2 = ((!(__pyx_v_have_step != 0)) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":871\n * \n *         if not have_step:\n *             step = 1             # <<<<<<<<<<<<<<\n * \n * \n */\n      __pyx_v_step = 1;\n\n      /* \"View.MemoryView\":870\n *                 stop = shape\n * \n *         if not have_step:             # <<<<<<<<<<<<<<\n *             step = 1\n * \n */\n    }\n\n    /* \"View.MemoryView\":875\n * \n *         with cython.cdivision(True):\n *             new_shape = (stop - start) // step             # <<<<<<<<<<<<<<\n * \n *             if (stop - start) - step * new_shape:\n */\n    __pyx_v_new_shape = ((__pyx_v_stop - __pyx_v_start) / __pyx_v_step);\n\n    /* \"View.MemoryView\":877\n *             new_shape = (stop - start) // step\n * \n *        
     if (stop - start) - step * new_shape:             # <<<<<<<<<<<<<<\n *                 new_shape += 1\n * \n */\n    __pyx_t_2 = (((__pyx_v_stop - __pyx_v_start) - (__pyx_v_step * __pyx_v_new_shape)) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":878\n * \n *             if (stop - start) - step * new_shape:\n *                 new_shape += 1             # <<<<<<<<<<<<<<\n * \n *         if new_shape < 0:\n */\n      __pyx_v_new_shape = (__pyx_v_new_shape + 1);\n\n      /* \"View.MemoryView\":877\n *             new_shape = (stop - start) // step\n * \n *             if (stop - start) - step * new_shape:             # <<<<<<<<<<<<<<\n *                 new_shape += 1\n * \n */\n    }\n\n    /* \"View.MemoryView\":880\n *                 new_shape += 1\n * \n *         if new_shape < 0:             # <<<<<<<<<<<<<<\n *             new_shape = 0\n * \n */\n    __pyx_t_2 = ((__pyx_v_new_shape < 0) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":881\n * \n *         if new_shape < 0:\n *             new_shape = 0             # <<<<<<<<<<<<<<\n * \n * \n */\n      __pyx_v_new_shape = 0;\n\n      /* \"View.MemoryView\":880\n *                 new_shape += 1\n * \n *         if new_shape < 0:             # <<<<<<<<<<<<<<\n *             new_shape = 0\n * \n */\n    }\n\n    /* \"View.MemoryView\":884\n * \n * \n *         dst.strides[new_ndim] = stride * step             # <<<<<<<<<<<<<<\n *         dst.shape[new_ndim] = new_shape\n *         dst.suboffsets[new_ndim] = suboffset\n */\n    (__pyx_v_dst->strides[__pyx_v_new_ndim]) = (__pyx_v_stride * __pyx_v_step);\n\n    /* \"View.MemoryView\":885\n * \n *         dst.strides[new_ndim] = stride * step\n *         dst.shape[new_ndim] = new_shape             # <<<<<<<<<<<<<<\n *         dst.suboffsets[new_ndim] = suboffset\n * \n */\n    (__pyx_v_dst->shape[__pyx_v_new_ndim]) = __pyx_v_new_shape;\n\n    /* \"View.MemoryView\":886\n *         dst.strides[new_ndim] = stride * step\n *         
dst.shape[new_ndim] = new_shape\n *         dst.suboffsets[new_ndim] = suboffset             # <<<<<<<<<<<<<<\n * \n * \n */\n    (__pyx_v_dst->suboffsets[__pyx_v_new_ndim]) = __pyx_v_suboffset;\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":889\n * \n * \n *     if suboffset_dim[0] < 0:             # <<<<<<<<<<<<<<\n *         dst.data += start * stride\n *     else:\n */\n  __pyx_t_2 = (((__pyx_v_suboffset_dim[0]) < 0) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":890\n * \n *     if suboffset_dim[0] < 0:\n *         dst.data += start * stride             # <<<<<<<<<<<<<<\n *     else:\n *         dst.suboffsets[suboffset_dim[0]] += start * stride\n */\n    __pyx_v_dst->data = (__pyx_v_dst->data + (__pyx_v_start * __pyx_v_stride));\n\n    /* \"View.MemoryView\":889\n * \n * \n *     if suboffset_dim[0] < 0:             # <<<<<<<<<<<<<<\n *         dst.data += start * stride\n *     else:\n */\n    goto __pyx_L23;\n  }\n\n  /* \"View.MemoryView\":892\n *         dst.data += start * stride\n *     else:\n *         dst.suboffsets[suboffset_dim[0]] += start * stride             # <<<<<<<<<<<<<<\n * \n *     if suboffset >= 0:\n */\n  /*else*/ {\n    __pyx_t_3 = (__pyx_v_suboffset_dim[0]);\n    (__pyx_v_dst->suboffsets[__pyx_t_3]) = ((__pyx_v_dst->suboffsets[__pyx_t_3]) + (__pyx_v_start * __pyx_v_stride));\n  }\n  __pyx_L23:;\n\n  /* \"View.MemoryView\":894\n *         dst.suboffsets[suboffset_dim[0]] += start * stride\n * \n *     if suboffset >= 0:             # <<<<<<<<<<<<<<\n *         if not is_slice:\n *             if new_ndim == 0:\n */\n  __pyx_t_2 = ((__pyx_v_suboffset >= 0) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":895\n * \n *     if suboffset >= 0:\n *         if not is_slice:             # <<<<<<<<<<<<<<\n *             if new_ndim == 0:\n *                 dst.data = (<char **> dst.data)[0] + suboffset\n */\n    __pyx_t_2 = ((!(__pyx_v_is_slice != 0)) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":896\n *     if 
suboffset >= 0:\n *         if not is_slice:\n *             if new_ndim == 0:             # <<<<<<<<<<<<<<\n *                 dst.data = (<char **> dst.data)[0] + suboffset\n *             else:\n */\n      __pyx_t_2 = ((__pyx_v_new_ndim == 0) != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":897\n *         if not is_slice:\n *             if new_ndim == 0:\n *                 dst.data = (<char **> dst.data)[0] + suboffset             # <<<<<<<<<<<<<<\n *             else:\n *                 _err_dim(IndexError, \"All dimensions preceding dimension %d \"\n */\n        __pyx_v_dst->data = ((((char **)__pyx_v_dst->data)[0]) + __pyx_v_suboffset);\n\n        /* \"View.MemoryView\":896\n *     if suboffset >= 0:\n *         if not is_slice:\n *             if new_ndim == 0:             # <<<<<<<<<<<<<<\n *                 dst.data = (<char **> dst.data)[0] + suboffset\n *             else:\n */\n        goto __pyx_L26;\n      }\n\n      /* \"View.MemoryView\":899\n *                 dst.data = (<char **> dst.data)[0] + suboffset\n *             else:\n *                 _err_dim(IndexError, \"All dimensions preceding dimension %d \"             # <<<<<<<<<<<<<<\n *                                      \"must be indexed and not sliced\", dim)\n *         else:\n */\n      /*else*/ {\n\n        /* \"View.MemoryView\":900\n *             else:\n *                 _err_dim(IndexError, \"All dimensions preceding dimension %d \"\n *                                      \"must be indexed and not sliced\", dim)             # <<<<<<<<<<<<<<\n *         else:\n *             suboffset_dim[0] = new_ndim\n */\n        __pyx_t_3 = __pyx_memoryview_err_dim(__pyx_builtin_IndexError, ((char *)\"All dimensions preceding dimension %d must be indexed and not sliced\"), __pyx_v_dim); if (unlikely(__pyx_t_3 == ((int)-1))) __PYX_ERR(2, 899, __pyx_L1_error)\n      }\n      __pyx_L26:;\n\n      /* \"View.MemoryView\":895\n * \n *     if suboffset >= 0:\n *         if not 
is_slice:             # <<<<<<<<<<<<<<\n *             if new_ndim == 0:\n *                 dst.data = (<char **> dst.data)[0] + suboffset\n */\n      goto __pyx_L25;\n    }\n\n    /* \"View.MemoryView\":902\n *                                      \"must be indexed and not sliced\", dim)\n *         else:\n *             suboffset_dim[0] = new_ndim             # <<<<<<<<<<<<<<\n * \n *     return 0\n */\n    /*else*/ {\n      (__pyx_v_suboffset_dim[0]) = __pyx_v_new_ndim;\n    }\n    __pyx_L25:;\n\n    /* \"View.MemoryView\":894\n *         dst.suboffsets[suboffset_dim[0]] += start * stride\n * \n *     if suboffset >= 0:             # <<<<<<<<<<<<<<\n *         if not is_slice:\n *             if new_ndim == 0:\n */\n  }\n\n  /* \"View.MemoryView\":904\n *             suboffset_dim[0] = new_ndim\n * \n *     return 0             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":807\n * \n * @cname('__pyx_memoryview_slice_memviewslice')\n * cdef int slice_memviewslice(             # <<<<<<<<<<<<<<\n *         __Pyx_memviewslice *dst,\n *         Py_ssize_t shape, Py_ssize_t stride, Py_ssize_t suboffset,\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  {\n    #ifdef WITH_THREAD\n    PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n    #endif\n    __Pyx_AddTraceback(\"View.MemoryView.slice_memviewslice\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n    #ifdef WITH_THREAD\n    __Pyx_PyGILState_Release(__pyx_gilstate_save);\n    #endif\n  }\n  __pyx_r = -1;\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":910\n * \n * @cname('__pyx_pybuffer_index')\n * cdef char *pybuffer_index(Py_buffer *view, char *bufp, Py_ssize_t index,             # <<<<<<<<<<<<<<\n *                           Py_ssize_t dim) except NULL:\n *     cdef Py_ssize_t shape, stride, suboffset = -1\n */\n\nstatic char *__pyx_pybuffer_index(Py_buffer *__pyx_v_view, char *__pyx_v_bufp, Py_ssize_t __pyx_v_index, 
Py_ssize_t __pyx_v_dim) {\n  Py_ssize_t __pyx_v_shape;\n  Py_ssize_t __pyx_v_stride;\n  Py_ssize_t __pyx_v_suboffset;\n  Py_ssize_t __pyx_v_itemsize;\n  char *__pyx_v_resultp;\n  char *__pyx_r;\n  __Pyx_RefNannyDeclarations\n  Py_ssize_t __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"pybuffer_index\", 0);\n\n  /* \"View.MemoryView\":912\n * cdef char *pybuffer_index(Py_buffer *view, char *bufp, Py_ssize_t index,\n *                           Py_ssize_t dim) except NULL:\n *     cdef Py_ssize_t shape, stride, suboffset = -1             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t itemsize = view.itemsize\n *     cdef char *resultp\n */\n  __pyx_v_suboffset = -1L;\n\n  /* \"View.MemoryView\":913\n *                           Py_ssize_t dim) except NULL:\n *     cdef Py_ssize_t shape, stride, suboffset = -1\n *     cdef Py_ssize_t itemsize = view.itemsize             # <<<<<<<<<<<<<<\n *     cdef char *resultp\n * \n */\n  __pyx_t_1 = __pyx_v_view->itemsize;\n  __pyx_v_itemsize = __pyx_t_1;\n\n  /* \"View.MemoryView\":916\n *     cdef char *resultp\n * \n *     if view.ndim == 0:             # <<<<<<<<<<<<<<\n *         shape = view.len / itemsize\n *         stride = itemsize\n */\n  __pyx_t_2 = ((__pyx_v_view->ndim == 0) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":917\n * \n *     if view.ndim == 0:\n *         shape = view.len / itemsize             # <<<<<<<<<<<<<<\n *         stride = itemsize\n *     else:\n */\n    if (unlikely(__pyx_v_itemsize == 0)) {\n      PyErr_SetString(PyExc_ZeroDivisionError, \"integer division or modulo by zero\");\n      __PYX_ERR(2, 917, __pyx_L1_error)\n    }\n    else if (sizeof(Py_ssize_t) == sizeof(long) && (!(((Py_ssize_t)-1) > 0)) && unlikely(__pyx_v_itemsize == (Py_ssize_t)-1)  && unlikely(UNARY_NEG_WOULD_OVERFLOW(__pyx_v_view->len))) {\n      
PyErr_SetString(PyExc_OverflowError, \"value too large to perform division\");\n      __PYX_ERR(2, 917, __pyx_L1_error)\n    }\n    __pyx_v_shape = __Pyx_div_Py_ssize_t(__pyx_v_view->len, __pyx_v_itemsize);\n\n    /* \"View.MemoryView\":918\n *     if view.ndim == 0:\n *         shape = view.len / itemsize\n *         stride = itemsize             # <<<<<<<<<<<<<<\n *     else:\n *         shape = view.shape[dim]\n */\n    __pyx_v_stride = __pyx_v_itemsize;\n\n    /* \"View.MemoryView\":916\n *     cdef char *resultp\n * \n *     if view.ndim == 0:             # <<<<<<<<<<<<<<\n *         shape = view.len / itemsize\n *         stride = itemsize\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":920\n *         stride = itemsize\n *     else:\n *         shape = view.shape[dim]             # <<<<<<<<<<<<<<\n *         stride = view.strides[dim]\n *         if view.suboffsets != NULL:\n */\n  /*else*/ {\n    __pyx_v_shape = (__pyx_v_view->shape[__pyx_v_dim]);\n\n    /* \"View.MemoryView\":921\n *     else:\n *         shape = view.shape[dim]\n *         stride = view.strides[dim]             # <<<<<<<<<<<<<<\n *         if view.suboffsets != NULL:\n *             suboffset = view.suboffsets[dim]\n */\n    __pyx_v_stride = (__pyx_v_view->strides[__pyx_v_dim]);\n\n    /* \"View.MemoryView\":922\n *         shape = view.shape[dim]\n *         stride = view.strides[dim]\n *         if view.suboffsets != NULL:             # <<<<<<<<<<<<<<\n *             suboffset = view.suboffsets[dim]\n * \n */\n    __pyx_t_2 = ((__pyx_v_view->suboffsets != NULL) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":923\n *         stride = view.strides[dim]\n *         if view.suboffsets != NULL:\n *             suboffset = view.suboffsets[dim]             # <<<<<<<<<<<<<<\n * \n *     if index < 0:\n */\n      __pyx_v_suboffset = (__pyx_v_view->suboffsets[__pyx_v_dim]);\n\n      /* \"View.MemoryView\":922\n *         shape = view.shape[dim]\n *         stride = 
view.strides[dim]\n *         if view.suboffsets != NULL:             # <<<<<<<<<<<<<<\n *             suboffset = view.suboffsets[dim]\n * \n */\n    }\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":925\n *             suboffset = view.suboffsets[dim]\n * \n *     if index < 0:             # <<<<<<<<<<<<<<\n *         index += view.shape[dim]\n *         if index < 0:\n */\n  __pyx_t_2 = ((__pyx_v_index < 0) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":926\n * \n *     if index < 0:\n *         index += view.shape[dim]             # <<<<<<<<<<<<<<\n *         if index < 0:\n *             raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n */\n    __pyx_v_index = (__pyx_v_index + (__pyx_v_view->shape[__pyx_v_dim]));\n\n    /* \"View.MemoryView\":927\n *     if index < 0:\n *         index += view.shape[dim]\n *         if index < 0:             # <<<<<<<<<<<<<<\n *             raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n * \n */\n    __pyx_t_2 = ((__pyx_v_index < 0) != 0);\n    if (unlikely(__pyx_t_2)) {\n\n      /* \"View.MemoryView\":928\n *         index += view.shape[dim]\n *         if index < 0:\n *             raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)             # <<<<<<<<<<<<<<\n * \n *     if index >= shape:\n */\n      __pyx_t_3 = PyInt_FromSsize_t(__pyx_v_dim); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 928, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_3);\n      __pyx_t_4 = __Pyx_PyString_Format(__pyx_kp_s_Out_of_bounds_on_buffer_access_a, __pyx_t_3); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 928, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_4);\n      __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n      __pyx_t_3 = __Pyx_PyObject_CallOneArg(__pyx_builtin_IndexError, __pyx_t_4); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 928, __pyx_L1_error)\n      __Pyx_GOTREF(__pyx_t_3);\n      __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n      __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n      
__Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n      __PYX_ERR(2, 928, __pyx_L1_error)\n\n      /* \"View.MemoryView\":927\n *     if index < 0:\n *         index += view.shape[dim]\n *         if index < 0:             # <<<<<<<<<<<<<<\n *             raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n * \n */\n    }\n\n    /* \"View.MemoryView\":925\n *             suboffset = view.suboffsets[dim]\n * \n *     if index < 0:             # <<<<<<<<<<<<<<\n *         index += view.shape[dim]\n *         if index < 0:\n */\n  }\n\n  /* \"View.MemoryView\":930\n *             raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n * \n *     if index >= shape:             # <<<<<<<<<<<<<<\n *         raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n * \n */\n  __pyx_t_2 = ((__pyx_v_index >= __pyx_v_shape) != 0);\n  if (unlikely(__pyx_t_2)) {\n\n    /* \"View.MemoryView\":931\n * \n *     if index >= shape:\n *         raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)             # <<<<<<<<<<<<<<\n * \n *     resultp = bufp + index * stride\n */\n    __pyx_t_3 = PyInt_FromSsize_t(__pyx_v_dim); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 931, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __pyx_t_4 = __Pyx_PyString_Format(__pyx_kp_s_Out_of_bounds_on_buffer_access_a, __pyx_t_3); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 931, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __pyx_t_3 = __Pyx_PyObject_CallOneArg(__pyx_builtin_IndexError, __pyx_t_4); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 931, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n    __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __PYX_ERR(2, 931, __pyx_L1_error)\n\n    /* \"View.MemoryView\":930\n *             raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n * \n *     if index >= shape:             # 
<<<<<<<<<<<<<<\n *         raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n * \n */\n  }\n\n  /* \"View.MemoryView\":933\n *         raise IndexError(\"Out of bounds on buffer access (axis %d)\" % dim)\n * \n *     resultp = bufp + index * stride             # <<<<<<<<<<<<<<\n *     if suboffset >= 0:\n *         resultp = (<char **> resultp)[0] + suboffset\n */\n  __pyx_v_resultp = (__pyx_v_bufp + (__pyx_v_index * __pyx_v_stride));\n\n  /* \"View.MemoryView\":934\n * \n *     resultp = bufp + index * stride\n *     if suboffset >= 0:             # <<<<<<<<<<<<<<\n *         resultp = (<char **> resultp)[0] + suboffset\n * \n */\n  __pyx_t_2 = ((__pyx_v_suboffset >= 0) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":935\n *     resultp = bufp + index * stride\n *     if suboffset >= 0:\n *         resultp = (<char **> resultp)[0] + suboffset             # <<<<<<<<<<<<<<\n * \n *     return resultp\n */\n    __pyx_v_resultp = ((((char **)__pyx_v_resultp)[0]) + __pyx_v_suboffset);\n\n    /* \"View.MemoryView\":934\n * \n *     resultp = bufp + index * stride\n *     if suboffset >= 0:             # <<<<<<<<<<<<<<\n *         resultp = (<char **> resultp)[0] + suboffset\n * \n */\n  }\n\n  /* \"View.MemoryView\":937\n *         resultp = (<char **> resultp)[0] + suboffset\n * \n *     return resultp             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = __pyx_v_resultp;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":910\n * \n * @cname('__pyx_pybuffer_index')\n * cdef char *pybuffer_index(Py_buffer *view, char *bufp, Py_ssize_t index,             # <<<<<<<<<<<<<<\n *                           Py_ssize_t dim) except NULL:\n *     cdef Py_ssize_t shape, stride, suboffset = -1\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_AddTraceback(\"View.MemoryView.pybuffer_index\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  
__Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":943\n * \n * @cname('__pyx_memslice_transpose')\n * cdef int transpose_memslice(__Pyx_memviewslice *memslice) nogil except 0:             # <<<<<<<<<<<<<<\n *     cdef int ndim = memslice.memview.view.ndim\n * \n */\n\nstatic int __pyx_memslice_transpose(__Pyx_memviewslice *__pyx_v_memslice) {\n  int __pyx_v_ndim;\n  Py_ssize_t *__pyx_v_shape;\n  Py_ssize_t *__pyx_v_strides;\n  int __pyx_v_i;\n  int __pyx_v_j;\n  int __pyx_r;\n  int __pyx_t_1;\n  Py_ssize_t *__pyx_t_2;\n  long __pyx_t_3;\n  long __pyx_t_4;\n  Py_ssize_t __pyx_t_5;\n  Py_ssize_t __pyx_t_6;\n  int __pyx_t_7;\n  int __pyx_t_8;\n  int __pyx_t_9;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n\n  /* \"View.MemoryView\":944\n * @cname('__pyx_memslice_transpose')\n * cdef int transpose_memslice(__Pyx_memviewslice *memslice) nogil except 0:\n *     cdef int ndim = memslice.memview.view.ndim             # <<<<<<<<<<<<<<\n * \n *     cdef Py_ssize_t *shape = memslice.shape\n */\n  __pyx_t_1 = __pyx_v_memslice->memview->view.ndim;\n  __pyx_v_ndim = __pyx_t_1;\n\n  /* \"View.MemoryView\":946\n *     cdef int ndim = memslice.memview.view.ndim\n * \n *     cdef Py_ssize_t *shape = memslice.shape             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t *strides = memslice.strides\n * \n */\n  __pyx_t_2 = __pyx_v_memslice->shape;\n  __pyx_v_shape = __pyx_t_2;\n\n  /* \"View.MemoryView\":947\n * \n *     cdef Py_ssize_t *shape = memslice.shape\n *     cdef Py_ssize_t *strides = memslice.strides             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_2 = __pyx_v_memslice->strides;\n  __pyx_v_strides = __pyx_t_2;\n\n  /* \"View.MemoryView\":951\n * \n *     cdef int i, j\n *     for i in range(ndim / 2):             # <<<<<<<<<<<<<<\n *         j = ndim - 1 - i\n *         strides[i], strides[j] = strides[j], strides[i]\n */\n  __pyx_t_3 = __Pyx_div_long(__pyx_v_ndim, 2);\n  __pyx_t_4 = __pyx_t_3;\n  
for (__pyx_t_1 = 0; __pyx_t_1 < __pyx_t_4; __pyx_t_1+=1) {\n    __pyx_v_i = __pyx_t_1;\n\n    /* \"View.MemoryView\":952\n *     cdef int i, j\n *     for i in range(ndim / 2):\n *         j = ndim - 1 - i             # <<<<<<<<<<<<<<\n *         strides[i], strides[j] = strides[j], strides[i]\n *         shape[i], shape[j] = shape[j], shape[i]\n */\n    __pyx_v_j = ((__pyx_v_ndim - 1) - __pyx_v_i);\n\n    /* \"View.MemoryView\":953\n *     for i in range(ndim / 2):\n *         j = ndim - 1 - i\n *         strides[i], strides[j] = strides[j], strides[i]             # <<<<<<<<<<<<<<\n *         shape[i], shape[j] = shape[j], shape[i]\n * \n */\n    __pyx_t_5 = (__pyx_v_strides[__pyx_v_j]);\n    __pyx_t_6 = (__pyx_v_strides[__pyx_v_i]);\n    (__pyx_v_strides[__pyx_v_i]) = __pyx_t_5;\n    (__pyx_v_strides[__pyx_v_j]) = __pyx_t_6;\n\n    /* \"View.MemoryView\":954\n *         j = ndim - 1 - i\n *         strides[i], strides[j] = strides[j], strides[i]\n *         shape[i], shape[j] = shape[j], shape[i]             # <<<<<<<<<<<<<<\n * \n *         if memslice.suboffsets[i] >= 0 or memslice.suboffsets[j] >= 0:\n */\n    __pyx_t_6 = (__pyx_v_shape[__pyx_v_j]);\n    __pyx_t_5 = (__pyx_v_shape[__pyx_v_i]);\n    (__pyx_v_shape[__pyx_v_i]) = __pyx_t_6;\n    (__pyx_v_shape[__pyx_v_j]) = __pyx_t_5;\n\n    /* \"View.MemoryView\":956\n *         shape[i], shape[j] = shape[j], shape[i]\n * \n *         if memslice.suboffsets[i] >= 0 or memslice.suboffsets[j] >= 0:             # <<<<<<<<<<<<<<\n *             _err(ValueError, \"Cannot transpose memoryview with indirect dimensions\")\n * \n */\n    __pyx_t_8 = (((__pyx_v_memslice->suboffsets[__pyx_v_i]) >= 0) != 0);\n    if (!__pyx_t_8) {\n    } else {\n      __pyx_t_7 = __pyx_t_8;\n      goto __pyx_L6_bool_binop_done;\n    }\n    __pyx_t_8 = (((__pyx_v_memslice->suboffsets[__pyx_v_j]) >= 0) != 0);\n    __pyx_t_7 = __pyx_t_8;\n    __pyx_L6_bool_binop_done:;\n    if (__pyx_t_7) {\n\n      /* \"View.MemoryView\":957\n * \n *         
if memslice.suboffsets[i] >= 0 or memslice.suboffsets[j] >= 0:\n *             _err(ValueError, \"Cannot transpose memoryview with indirect dimensions\")             # <<<<<<<<<<<<<<\n * \n *     return 1\n */\n      __pyx_t_9 = __pyx_memoryview_err(__pyx_builtin_ValueError, ((char *)\"Cannot transpose memoryview with indirect dimensions\")); if (unlikely(__pyx_t_9 == ((int)-1))) __PYX_ERR(2, 957, __pyx_L1_error)\n\n      /* \"View.MemoryView\":956\n *         shape[i], shape[j] = shape[j], shape[i]\n * \n *         if memslice.suboffsets[i] >= 0 or memslice.suboffsets[j] >= 0:             # <<<<<<<<<<<<<<\n *             _err(ValueError, \"Cannot transpose memoryview with indirect dimensions\")\n * \n */\n    }\n  }\n\n  /* \"View.MemoryView\":959\n *             _err(ValueError, \"Cannot transpose memoryview with indirect dimensions\")\n * \n *     return 1             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = 1;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":943\n * \n * @cname('__pyx_memslice_transpose')\n * cdef int transpose_memslice(__Pyx_memviewslice *memslice) nogil except 0:             # <<<<<<<<<<<<<<\n *     cdef int ndim = memslice.memview.view.ndim\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  {\n    #ifdef WITH_THREAD\n    PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n    #endif\n    __Pyx_AddTraceback(\"View.MemoryView.transpose_memslice\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n    #ifdef WITH_THREAD\n    __Pyx_PyGILState_Release(__pyx_gilstate_save);\n    #endif\n  }\n  __pyx_r = 0;\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":976\n *     cdef int (*to_dtype_func)(char *, object) except 0\n * \n *     def __dealloc__(self):             # <<<<<<<<<<<<<<\n *         __PYX_XDEC_MEMVIEW(&self.from_slice, 1)\n * \n */\n\n/* Python wrapper */\nstatic void __pyx_memoryviewslice___dealloc__(PyObject *__pyx_v_self); /*proto*/\nstatic void __pyx_memoryviewslice___dealloc__(PyObject 
*__pyx_v_self) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__dealloc__ (wrapper)\", 0);\n  __pyx_memoryviewslice___pyx_pf_15View_dot_MemoryView_16_memoryviewslice___dealloc__(((struct __pyx_memoryviewslice_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\nstatic void __pyx_memoryviewslice___pyx_pf_15View_dot_MemoryView_16_memoryviewslice___dealloc__(struct __pyx_memoryviewslice_obj *__pyx_v_self) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__dealloc__\", 0);\n\n  /* \"View.MemoryView\":977\n * \n *     def __dealloc__(self):\n *         __PYX_XDEC_MEMVIEW(&self.from_slice, 1)             # <<<<<<<<<<<<<<\n * \n *     cdef convert_item_to_object(self, char *itemp):\n */\n  __PYX_XDEC_MEMVIEW((&__pyx_v_self->from_slice), 1);\n\n  /* \"View.MemoryView\":976\n *     cdef int (*to_dtype_func)(char *, object) except 0\n * \n *     def __dealloc__(self):             # <<<<<<<<<<<<<<\n *         __PYX_XDEC_MEMVIEW(&self.from_slice, 1)\n * \n */\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\n/* \"View.MemoryView\":979\n *         __PYX_XDEC_MEMVIEW(&self.from_slice, 1)\n * \n *     cdef convert_item_to_object(self, char *itemp):             # <<<<<<<<<<<<<<\n *         if self.to_object_func != NULL:\n *             return self.to_object_func(itemp)\n */\n\nstatic PyObject *__pyx_memoryviewslice_convert_item_to_object(struct __pyx_memoryviewslice_obj *__pyx_v_self, char *__pyx_v_itemp) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"convert_item_to_object\", 0);\n\n  /* \"View.MemoryView\":980\n * \n *     cdef convert_item_to_object(self, char *itemp):\n *         if self.to_object_func != NULL:             # <<<<<<<<<<<<<<\n *             return self.to_object_func(itemp)\n *         
else:\n */\n  __pyx_t_1 = ((__pyx_v_self->to_object_func != NULL) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":981\n *     cdef convert_item_to_object(self, char *itemp):\n *         if self.to_object_func != NULL:\n *             return self.to_object_func(itemp)             # <<<<<<<<<<<<<<\n *         else:\n *             return memoryview.convert_item_to_object(self, itemp)\n */\n    __Pyx_XDECREF(__pyx_r);\n    __pyx_t_2 = __pyx_v_self->to_object_func(__pyx_v_itemp); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 981, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_r = __pyx_t_2;\n    __pyx_t_2 = 0;\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":980\n * \n *     cdef convert_item_to_object(self, char *itemp):\n *         if self.to_object_func != NULL:             # <<<<<<<<<<<<<<\n *             return self.to_object_func(itemp)\n *         else:\n */\n  }\n\n  /* \"View.MemoryView\":983\n *             return self.to_object_func(itemp)\n *         else:\n *             return memoryview.convert_item_to_object(self, itemp)             # <<<<<<<<<<<<<<\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):\n */\n  /*else*/ {\n    __Pyx_XDECREF(__pyx_r);\n    __pyx_t_2 = __pyx_memoryview_convert_item_to_object(((struct __pyx_memoryview_obj *)__pyx_v_self), __pyx_v_itemp); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 983, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_r = __pyx_t_2;\n    __pyx_t_2 = 0;\n    goto __pyx_L0;\n  }\n\n  /* \"View.MemoryView\":979\n *         __PYX_XDEC_MEMVIEW(&self.from_slice, 1)\n * \n *     cdef convert_item_to_object(self, char *itemp):             # <<<<<<<<<<<<<<\n *         if self.to_object_func != NULL:\n *             return self.to_object_func(itemp)\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_AddTraceback(\"View.MemoryView._memoryviewslice.convert_item_to_object\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  
__pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":985\n *             return memoryview.convert_item_to_object(self, itemp)\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):             # <<<<<<<<<<<<<<\n *         if self.to_dtype_func != NULL:\n *             self.to_dtype_func(itemp, value)\n */\n\nstatic PyObject *__pyx_memoryviewslice_assign_item_from_object(struct __pyx_memoryviewslice_obj *__pyx_v_self, char *__pyx_v_itemp, PyObject *__pyx_v_value) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"assign_item_from_object\", 0);\n\n  /* \"View.MemoryView\":986\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):\n *         if self.to_dtype_func != NULL:             # <<<<<<<<<<<<<<\n *             self.to_dtype_func(itemp, value)\n *         else:\n */\n  __pyx_t_1 = ((__pyx_v_self->to_dtype_func != NULL) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":987\n *     cdef assign_item_from_object(self, char *itemp, object value):\n *         if self.to_dtype_func != NULL:\n *             self.to_dtype_func(itemp, value)             # <<<<<<<<<<<<<<\n *         else:\n *             memoryview.assign_item_from_object(self, itemp, value)\n */\n    __pyx_t_2 = __pyx_v_self->to_dtype_func(__pyx_v_itemp, __pyx_v_value); if (unlikely(__pyx_t_2 == ((int)0))) __PYX_ERR(2, 987, __pyx_L1_error)\n\n    /* \"View.MemoryView\":986\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):\n *         if self.to_dtype_func != NULL:             # <<<<<<<<<<<<<<\n *             self.to_dtype_func(itemp, value)\n *         else:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":989\n *             self.to_dtype_func(itemp, value)\n 
*         else:\n *             memoryview.assign_item_from_object(self, itemp, value)             # <<<<<<<<<<<<<<\n * \n *     @property\n */\n  /*else*/ {\n    __pyx_t_3 = __pyx_memoryview_assign_item_from_object(((struct __pyx_memoryview_obj *)__pyx_v_self), __pyx_v_itemp, __pyx_v_value); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 989, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":985\n *             return memoryview.convert_item_to_object(self, itemp)\n * \n *     cdef assign_item_from_object(self, char *itemp, object value):             # <<<<<<<<<<<<<<\n *         if self.to_dtype_func != NULL:\n *             self.to_dtype_func(itemp, value)\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView._memoryviewslice.assign_item_from_object\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":992\n * \n *     @property\n *     def base(self):             # <<<<<<<<<<<<<<\n *         return self.from_object\n * \n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_16_memoryviewslice_4base_1__get__(PyObject *__pyx_v_self); /*proto*/\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_16_memoryviewslice_4base_1__get__(PyObject *__pyx_v_self) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf_15View_dot_MemoryView_16_memoryviewslice_4base___get__(((struct __pyx_memoryviewslice_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView_16_memoryviewslice_4base___get__(struct __pyx_memoryviewslice_obj 
*__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__get__\", 0);\n\n  /* \"View.MemoryView\":993\n *     @property\n *     def base(self):\n *         return self.from_object             # <<<<<<<<<<<<<<\n * \n *     __pyx_getbuffer = capsule(<void *> &__pyx_memoryview_getbuffer, \"getbuffer(obj, view, flags)\")\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(__pyx_v_self->from_object);\n  __pyx_r = __pyx_v_self->from_object;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":992\n * \n *     @property\n *     def base(self):             # <<<<<<<<<<<<<<\n *         return self.from_object\n * \n */\n\n  /* function exit code */\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_memoryviewslice_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/\nstatic PyObject *__pyx_pw___pyx_memoryviewslice_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__reduce_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_memoryviewslice___reduce_cython__(((struct __pyx_memoryviewslice_obj *)__pyx_v_self));\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_memoryviewslice___reduce_cython__(CYTHON_UNUSED struct __pyx_memoryviewslice_obj *__pyx_v_self) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__reduce_cython__\", 0);\n\n  /* 
\"(tree fragment)\":2\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n  __pyx_t_1 = __Pyx_PyObject_Call(__pyx_builtin_TypeError, __pyx_tuple__20, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 2, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_Raise(__pyx_t_1, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __PYX_ERR(2, 2, __pyx_L1_error)\n\n  /* \"(tree fragment)\":1\n * def __reduce_cython__(self):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView._memoryviewslice.__reduce_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":3\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw___pyx_memoryviewslice_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state); /*proto*/\nstatic PyObject *__pyx_pw___pyx_memoryviewslice_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__setstate_cython__ (wrapper)\", 0);\n  __pyx_r = __pyx_pf___pyx_memoryviewslice_2__setstate_cython__(((struct __pyx_memoryviewslice_obj *)__pyx_v_self), ((PyObject *)__pyx_v___pyx_state));\n\n  /* function 
exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf___pyx_memoryviewslice_2__setstate_cython__(CYTHON_UNUSED struct __pyx_memoryviewslice_obj *__pyx_v_self, CYTHON_UNUSED PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__setstate_cython__\", 0);\n\n  /* \"(tree fragment)\":4\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n */\n  __pyx_t_1 = __Pyx_PyObject_Call(__pyx_builtin_TypeError, __pyx_tuple__21, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 4, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_Raise(__pyx_t_1, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __PYX_ERR(2, 4, __pyx_L1_error)\n\n  /* \"(tree fragment)\":3\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):             # <<<<<<<<<<<<<<\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView._memoryviewslice.__setstate_cython__\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":999\n * \n * @cname('__pyx_memoryview_fromslice')\n * cdef memoryview_fromslice(__Pyx_memviewslice memviewslice,             # <<<<<<<<<<<<<<\n *                           int ndim,\n *                           object (*to_object_func)(char *),\n */\n\nstatic PyObject *__pyx_memoryview_fromslice(__Pyx_memviewslice 
__pyx_v_memviewslice, int __pyx_v_ndim, PyObject *(*__pyx_v_to_object_func)(char *), int (*__pyx_v_to_dtype_func)(char *, PyObject *), int __pyx_v_dtype_is_object) {\n  struct __pyx_memoryviewslice_obj *__pyx_v_result = 0;\n  Py_ssize_t __pyx_v_suboffset;\n  PyObject *__pyx_v_length = NULL;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  __Pyx_TypeInfo *__pyx_t_4;\n  Py_buffer __pyx_t_5;\n  Py_ssize_t *__pyx_t_6;\n  Py_ssize_t *__pyx_t_7;\n  Py_ssize_t *__pyx_t_8;\n  Py_ssize_t __pyx_t_9;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"memoryview_fromslice\", 0);\n\n  /* \"View.MemoryView\":1007\n *     cdef _memoryviewslice result\n * \n *     if <PyObject *> memviewslice.memview == Py_None:             # <<<<<<<<<<<<<<\n *         return None\n * \n */\n  __pyx_t_1 = ((((PyObject *)__pyx_v_memviewslice.memview) == Py_None) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":1008\n * \n *     if <PyObject *> memviewslice.memview == Py_None:\n *         return None             # <<<<<<<<<<<<<<\n * \n * \n */\n    __Pyx_XDECREF(__pyx_r);\n    __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":1007\n *     cdef _memoryviewslice result\n * \n *     if <PyObject *> memviewslice.memview == Py_None:             # <<<<<<<<<<<<<<\n *         return None\n * \n */\n  }\n\n  /* \"View.MemoryView\":1013\n * \n * \n *     result = _memoryviewslice(None, 0, dtype_is_object)             # <<<<<<<<<<<<<<\n * \n *     result.from_slice = memviewslice\n */\n  __pyx_t_2 = __Pyx_PyBool_FromLong(__pyx_v_dtype_is_object); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1013, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_3 = PyTuple_New(3); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 1013, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __Pyx_INCREF(Py_None);\n  
__Pyx_GIVEREF(Py_None);\n  PyTuple_SET_ITEM(__pyx_t_3, 0, Py_None);\n  __Pyx_INCREF(__pyx_int_0);\n  __Pyx_GIVEREF(__pyx_int_0);\n  PyTuple_SET_ITEM(__pyx_t_3, 1, __pyx_int_0);\n  __Pyx_GIVEREF(__pyx_t_2);\n  PyTuple_SET_ITEM(__pyx_t_3, 2, __pyx_t_2);\n  __pyx_t_2 = 0;\n  __pyx_t_2 = __Pyx_PyObject_Call(((PyObject *)__pyx_memoryviewslice_type), __pyx_t_3, NULL); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1013, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __pyx_v_result = ((struct __pyx_memoryviewslice_obj *)__pyx_t_2);\n  __pyx_t_2 = 0;\n\n  /* \"View.MemoryView\":1015\n *     result = _memoryviewslice(None, 0, dtype_is_object)\n * \n *     result.from_slice = memviewslice             # <<<<<<<<<<<<<<\n *     __PYX_INC_MEMVIEW(&memviewslice, 1)\n * \n */\n  __pyx_v_result->from_slice = __pyx_v_memviewslice;\n\n  /* \"View.MemoryView\":1016\n * \n *     result.from_slice = memviewslice\n *     __PYX_INC_MEMVIEW(&memviewslice, 1)             # <<<<<<<<<<<<<<\n * \n *     result.from_object = (<memoryview> memviewslice.memview).base\n */\n  __PYX_INC_MEMVIEW((&__pyx_v_memviewslice), 1);\n\n  /* \"View.MemoryView\":1018\n *     __PYX_INC_MEMVIEW(&memviewslice, 1)\n * \n *     result.from_object = (<memoryview> memviewslice.memview).base             # <<<<<<<<<<<<<<\n *     result.typeinfo = memviewslice.memview.typeinfo\n * \n */\n  __pyx_t_2 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_memviewslice.memview), __pyx_n_s_base); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1018, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __Pyx_GIVEREF(__pyx_t_2);\n  __Pyx_GOTREF(__pyx_v_result->from_object);\n  __Pyx_DECREF(__pyx_v_result->from_object);\n  __pyx_v_result->from_object = __pyx_t_2;\n  __pyx_t_2 = 0;\n\n  /* \"View.MemoryView\":1019\n * \n *     result.from_object = (<memoryview> memviewslice.memview).base\n *     result.typeinfo = memviewslice.memview.typeinfo             # <<<<<<<<<<<<<<\n * \n *     result.view = 
memviewslice.memview.view\n */\n  __pyx_t_4 = __pyx_v_memviewslice.memview->typeinfo;\n  __pyx_v_result->__pyx_base.typeinfo = __pyx_t_4;\n\n  /* \"View.MemoryView\":1021\n *     result.typeinfo = memviewslice.memview.typeinfo\n * \n *     result.view = memviewslice.memview.view             # <<<<<<<<<<<<<<\n *     result.view.buf = <void *> memviewslice.data\n *     result.view.ndim = ndim\n */\n  __pyx_t_5 = __pyx_v_memviewslice.memview->view;\n  __pyx_v_result->__pyx_base.view = __pyx_t_5;\n\n  /* \"View.MemoryView\":1022\n * \n *     result.view = memviewslice.memview.view\n *     result.view.buf = <void *> memviewslice.data             # <<<<<<<<<<<<<<\n *     result.view.ndim = ndim\n *     (<__pyx_buffer *> &result.view).obj = Py_None\n */\n  __pyx_v_result->__pyx_base.view.buf = ((void *)__pyx_v_memviewslice.data);\n\n  /* \"View.MemoryView\":1023\n *     result.view = memviewslice.memview.view\n *     result.view.buf = <void *> memviewslice.data\n *     result.view.ndim = ndim             # <<<<<<<<<<<<<<\n *     (<__pyx_buffer *> &result.view).obj = Py_None\n *     Py_INCREF(Py_None)\n */\n  __pyx_v_result->__pyx_base.view.ndim = __pyx_v_ndim;\n\n  /* \"View.MemoryView\":1024\n *     result.view.buf = <void *> memviewslice.data\n *     result.view.ndim = ndim\n *     (<__pyx_buffer *> &result.view).obj = Py_None             # <<<<<<<<<<<<<<\n *     Py_INCREF(Py_None)\n * \n */\n  ((Py_buffer *)(&__pyx_v_result->__pyx_base.view))->obj = Py_None;\n\n  /* \"View.MemoryView\":1025\n *     result.view.ndim = ndim\n *     (<__pyx_buffer *> &result.view).obj = Py_None\n *     Py_INCREF(Py_None)             # <<<<<<<<<<<<<<\n * \n *     if (<memoryview>memviewslice.memview).flags & PyBUF_WRITABLE:\n */\n  Py_INCREF(Py_None);\n\n  /* \"View.MemoryView\":1027\n *     Py_INCREF(Py_None)\n * \n *     if (<memoryview>memviewslice.memview).flags & PyBUF_WRITABLE:             # <<<<<<<<<<<<<<\n *         result.flags = PyBUF_RECORDS\n *     else:\n */\n  __pyx_t_1 = 
((((struct __pyx_memoryview_obj *)__pyx_v_memviewslice.memview)->flags & PyBUF_WRITABLE) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":1028\n * \n *     if (<memoryview>memviewslice.memview).flags & PyBUF_WRITABLE:\n *         result.flags = PyBUF_RECORDS             # <<<<<<<<<<<<<<\n *     else:\n *         result.flags = PyBUF_RECORDS_RO\n */\n    __pyx_v_result->__pyx_base.flags = PyBUF_RECORDS;\n\n    /* \"View.MemoryView\":1027\n *     Py_INCREF(Py_None)\n * \n *     if (<memoryview>memviewslice.memview).flags & PyBUF_WRITABLE:             # <<<<<<<<<<<<<<\n *         result.flags = PyBUF_RECORDS\n *     else:\n */\n    goto __pyx_L4;\n  }\n\n  /* \"View.MemoryView\":1030\n *         result.flags = PyBUF_RECORDS\n *     else:\n *         result.flags = PyBUF_RECORDS_RO             # <<<<<<<<<<<<<<\n * \n *     result.view.shape = <Py_ssize_t *> result.from_slice.shape\n */\n  /*else*/ {\n    __pyx_v_result->__pyx_base.flags = PyBUF_RECORDS_RO;\n  }\n  __pyx_L4:;\n\n  /* \"View.MemoryView\":1032\n *         result.flags = PyBUF_RECORDS_RO\n * \n *     result.view.shape = <Py_ssize_t *> result.from_slice.shape             # <<<<<<<<<<<<<<\n *     result.view.strides = <Py_ssize_t *> result.from_slice.strides\n * \n */\n  __pyx_v_result->__pyx_base.view.shape = ((Py_ssize_t *)__pyx_v_result->from_slice.shape);\n\n  /* \"View.MemoryView\":1033\n * \n *     result.view.shape = <Py_ssize_t *> result.from_slice.shape\n *     result.view.strides = <Py_ssize_t *> result.from_slice.strides             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_v_result->__pyx_base.view.strides = ((Py_ssize_t *)__pyx_v_result->from_slice.strides);\n\n  /* \"View.MemoryView\":1036\n * \n * \n *     result.view.suboffsets = NULL             # <<<<<<<<<<<<<<\n *     for suboffset in result.from_slice.suboffsets[:ndim]:\n *         if suboffset >= 0:\n */\n  __pyx_v_result->__pyx_base.view.suboffsets = NULL;\n\n  /* \"View.MemoryView\":1037\n * \n *     result.view.suboffsets = 
NULL\n *     for suboffset in result.from_slice.suboffsets[:ndim]:             # <<<<<<<<<<<<<<\n *         if suboffset >= 0:\n *             result.view.suboffsets = <Py_ssize_t *> result.from_slice.suboffsets\n */\n  __pyx_t_7 = (__pyx_v_result->from_slice.suboffsets + __pyx_v_ndim);\n  for (__pyx_t_8 = __pyx_v_result->from_slice.suboffsets; __pyx_t_8 < __pyx_t_7; __pyx_t_8++) {\n    __pyx_t_6 = __pyx_t_8;\n    __pyx_v_suboffset = (__pyx_t_6[0]);\n\n    /* \"View.MemoryView\":1038\n *     result.view.suboffsets = NULL\n *     for suboffset in result.from_slice.suboffsets[:ndim]:\n *         if suboffset >= 0:             # <<<<<<<<<<<<<<\n *             result.view.suboffsets = <Py_ssize_t *> result.from_slice.suboffsets\n *             break\n */\n    __pyx_t_1 = ((__pyx_v_suboffset >= 0) != 0);\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":1039\n *     for suboffset in result.from_slice.suboffsets[:ndim]:\n *         if suboffset >= 0:\n *             result.view.suboffsets = <Py_ssize_t *> result.from_slice.suboffsets             # <<<<<<<<<<<<<<\n *             break\n * \n */\n      __pyx_v_result->__pyx_base.view.suboffsets = ((Py_ssize_t *)__pyx_v_result->from_slice.suboffsets);\n\n      /* \"View.MemoryView\":1040\n *         if suboffset >= 0:\n *             result.view.suboffsets = <Py_ssize_t *> result.from_slice.suboffsets\n *             break             # <<<<<<<<<<<<<<\n * \n *     result.view.len = result.view.itemsize\n */\n      goto __pyx_L6_break;\n\n      /* \"View.MemoryView\":1038\n *     result.view.suboffsets = NULL\n *     for suboffset in result.from_slice.suboffsets[:ndim]:\n *         if suboffset >= 0:             # <<<<<<<<<<<<<<\n *             result.view.suboffsets = <Py_ssize_t *> result.from_slice.suboffsets\n *             break\n */\n    }\n  }\n  __pyx_L6_break:;\n\n  /* \"View.MemoryView\":1042\n *             break\n * \n *     result.view.len = result.view.itemsize             # <<<<<<<<<<<<<<\n *     for 
length in result.view.shape[:ndim]:\n *         result.view.len *= length\n */\n  __pyx_t_9 = __pyx_v_result->__pyx_base.view.itemsize;\n  __pyx_v_result->__pyx_base.view.len = __pyx_t_9;\n\n  /* \"View.MemoryView\":1043\n * \n *     result.view.len = result.view.itemsize\n *     for length in result.view.shape[:ndim]:             # <<<<<<<<<<<<<<\n *         result.view.len *= length\n * \n */\n  __pyx_t_7 = (__pyx_v_result->__pyx_base.view.shape + __pyx_v_ndim);\n  for (__pyx_t_8 = __pyx_v_result->__pyx_base.view.shape; __pyx_t_8 < __pyx_t_7; __pyx_t_8++) {\n    __pyx_t_6 = __pyx_t_8;\n    __pyx_t_2 = PyInt_FromSsize_t((__pyx_t_6[0])); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1043, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_XDECREF_SET(__pyx_v_length, __pyx_t_2);\n    __pyx_t_2 = 0;\n\n    /* \"View.MemoryView\":1044\n *     result.view.len = result.view.itemsize\n *     for length in result.view.shape[:ndim]:\n *         result.view.len *= length             # <<<<<<<<<<<<<<\n * \n *     result.to_object_func = to_object_func\n */\n    __pyx_t_2 = PyInt_FromSsize_t(__pyx_v_result->__pyx_base.view.len); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1044, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_t_3 = PyNumber_InPlaceMultiply(__pyx_t_2, __pyx_v_length); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 1044, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __pyx_t_9 = __Pyx_PyIndex_AsSsize_t(__pyx_t_3); if (unlikely((__pyx_t_9 == (Py_ssize_t)-1) && PyErr_Occurred())) __PYX_ERR(2, 1044, __pyx_L1_error)\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __pyx_v_result->__pyx_base.view.len = __pyx_t_9;\n  }\n\n  /* \"View.MemoryView\":1046\n *         result.view.len *= length\n * \n *     result.to_object_func = to_object_func             # <<<<<<<<<<<<<<\n *     result.to_dtype_func = to_dtype_func\n * \n */\n  __pyx_v_result->to_object_func = __pyx_v_to_object_func;\n\n  /* \"View.MemoryView\":1047\n * \n *   
  result.to_object_func = to_object_func\n *     result.to_dtype_func = to_dtype_func             # <<<<<<<<<<<<<<\n * \n *     return result\n */\n  __pyx_v_result->to_dtype_func = __pyx_v_to_dtype_func;\n\n  /* \"View.MemoryView\":1049\n *     result.to_dtype_func = to_dtype_func\n * \n *     return result             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_get_slice_from_memoryview')\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(((PyObject *)__pyx_v_result));\n  __pyx_r = ((PyObject *)__pyx_v_result);\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":999\n * \n * @cname('__pyx_memoryview_fromslice')\n * cdef memoryview_fromslice(__Pyx_memviewslice memviewslice,             # <<<<<<<<<<<<<<\n *                           int ndim,\n *                           object (*to_object_func)(char *),\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview_fromslice\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XDECREF((PyObject *)__pyx_v_result);\n  __Pyx_XDECREF(__pyx_v_length);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1052\n * \n * @cname('__pyx_memoryview_get_slice_from_memoryview')\n * cdef __Pyx_memviewslice *get_slice_from_memview(memoryview memview,             # <<<<<<<<<<<<<<\n *                                                    __Pyx_memviewslice *mslice) except NULL:\n *     cdef _memoryviewslice obj\n */\n\nstatic __Pyx_memviewslice *__pyx_memoryview_get_slice_from_memoryview(struct __pyx_memoryview_obj *__pyx_v_memview, __Pyx_memviewslice *__pyx_v_mslice) {\n  struct __pyx_memoryviewslice_obj *__pyx_v_obj = 0;\n  __Pyx_memviewslice *__pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *__pyx_t_3 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  
__Pyx_RefNannySetupContext(\"get_slice_from_memview\", 0);\n\n  /* \"View.MemoryView\":1055\n *                                                    __Pyx_memviewslice *mslice) except NULL:\n *     cdef _memoryviewslice obj\n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         obj = memview\n *         return &obj.from_slice\n */\n  __pyx_t_1 = __Pyx_TypeCheck(((PyObject *)__pyx_v_memview), __pyx_memoryviewslice_type); \n  __pyx_t_2 = (__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1056\n *     cdef _memoryviewslice obj\n *     if isinstance(memview, _memoryviewslice):\n *         obj = memview             # <<<<<<<<<<<<<<\n *         return &obj.from_slice\n *     else:\n */\n    if (!(likely(((((PyObject *)__pyx_v_memview)) == Py_None) || likely(__Pyx_TypeTest(((PyObject *)__pyx_v_memview), __pyx_memoryviewslice_type))))) __PYX_ERR(2, 1056, __pyx_L1_error)\n    __pyx_t_3 = ((PyObject *)__pyx_v_memview);\n    __Pyx_INCREF(__pyx_t_3);\n    __pyx_v_obj = ((struct __pyx_memoryviewslice_obj *)__pyx_t_3);\n    __pyx_t_3 = 0;\n\n    /* \"View.MemoryView\":1057\n *     if isinstance(memview, _memoryviewslice):\n *         obj = memview\n *         return &obj.from_slice             # <<<<<<<<<<<<<<\n *     else:\n *         slice_copy(memview, mslice)\n */\n    __pyx_r = (&__pyx_v_obj->from_slice);\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":1055\n *                                                    __Pyx_memviewslice *mslice) except NULL:\n *     cdef _memoryviewslice obj\n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         obj = memview\n *         return &obj.from_slice\n */\n  }\n\n  /* \"View.MemoryView\":1059\n *         return &obj.from_slice\n *     else:\n *         slice_copy(memview, mslice)             # <<<<<<<<<<<<<<\n *         return mslice\n * \n */\n  /*else*/ {\n    __pyx_memoryview_slice_copy(__pyx_v_memview, __pyx_v_mslice);\n\n    /* 
\"View.MemoryView\":1060\n *     else:\n *         slice_copy(memview, mslice)\n *         return mslice             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_slice_copy')\n */\n    __pyx_r = __pyx_v_mslice;\n    goto __pyx_L0;\n  }\n\n  /* \"View.MemoryView\":1052\n * \n * @cname('__pyx_memoryview_get_slice_from_memoryview')\n * cdef __Pyx_memviewslice *get_slice_from_memview(memoryview memview,             # <<<<<<<<<<<<<<\n *                                                    __Pyx_memviewslice *mslice) except NULL:\n *     cdef _memoryviewslice obj\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_AddTraceback(\"View.MemoryView.get_slice_from_memview\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XDECREF((PyObject *)__pyx_v_obj);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1063\n * \n * @cname('__pyx_memoryview_slice_copy')\n * cdef void slice_copy(memoryview memview, __Pyx_memviewslice *dst):             # <<<<<<<<<<<<<<\n *     cdef int dim\n *     cdef (Py_ssize_t*) shape, strides, suboffsets\n */\n\nstatic void __pyx_memoryview_slice_copy(struct __pyx_memoryview_obj *__pyx_v_memview, __Pyx_memviewslice *__pyx_v_dst) {\n  int __pyx_v_dim;\n  Py_ssize_t *__pyx_v_shape;\n  Py_ssize_t *__pyx_v_strides;\n  Py_ssize_t *__pyx_v_suboffsets;\n  __Pyx_RefNannyDeclarations\n  Py_ssize_t *__pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  int __pyx_t_4;\n  Py_ssize_t __pyx_t_5;\n  __Pyx_RefNannySetupContext(\"slice_copy\", 0);\n\n  /* \"View.MemoryView\":1067\n *     cdef (Py_ssize_t*) shape, strides, suboffsets\n * \n *     shape = memview.view.shape             # <<<<<<<<<<<<<<\n *     strides = memview.view.strides\n *     suboffsets = memview.view.suboffsets\n */\n  __pyx_t_1 = __pyx_v_memview->view.shape;\n  __pyx_v_shape = __pyx_t_1;\n\n  /* \"View.MemoryView\":1068\n * \n *     shape = memview.view.shape\n *     strides = 
memview.view.strides             # <<<<<<<<<<<<<<\n *     suboffsets = memview.view.suboffsets\n * \n */\n  __pyx_t_1 = __pyx_v_memview->view.strides;\n  __pyx_v_strides = __pyx_t_1;\n\n  /* \"View.MemoryView\":1069\n *     shape = memview.view.shape\n *     strides = memview.view.strides\n *     suboffsets = memview.view.suboffsets             # <<<<<<<<<<<<<<\n * \n *     dst.memview = <__pyx_memoryview *> memview\n */\n  __pyx_t_1 = __pyx_v_memview->view.suboffsets;\n  __pyx_v_suboffsets = __pyx_t_1;\n\n  /* \"View.MemoryView\":1071\n *     suboffsets = memview.view.suboffsets\n * \n *     dst.memview = <__pyx_memoryview *> memview             # <<<<<<<<<<<<<<\n *     dst.data = <char *> memview.view.buf\n * \n */\n  __pyx_v_dst->memview = ((struct __pyx_memoryview_obj *)__pyx_v_memview);\n\n  /* \"View.MemoryView\":1072\n * \n *     dst.memview = <__pyx_memoryview *> memview\n *     dst.data = <char *> memview.view.buf             # <<<<<<<<<<<<<<\n * \n *     for dim in range(memview.view.ndim):\n */\n  __pyx_v_dst->data = ((char *)__pyx_v_memview->view.buf);\n\n  /* \"View.MemoryView\":1074\n *     dst.data = <char *> memview.view.buf\n * \n *     for dim in range(memview.view.ndim):             # <<<<<<<<<<<<<<\n *         dst.shape[dim] = shape[dim]\n *         dst.strides[dim] = strides[dim]\n */\n  __pyx_t_2 = __pyx_v_memview->view.ndim;\n  __pyx_t_3 = __pyx_t_2;\n  for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {\n    __pyx_v_dim = __pyx_t_4;\n\n    /* \"View.MemoryView\":1075\n * \n *     for dim in range(memview.view.ndim):\n *         dst.shape[dim] = shape[dim]             # <<<<<<<<<<<<<<\n *         dst.strides[dim] = strides[dim]\n *         dst.suboffsets[dim] = suboffsets[dim] if suboffsets else -1\n */\n    (__pyx_v_dst->shape[__pyx_v_dim]) = (__pyx_v_shape[__pyx_v_dim]);\n\n    /* \"View.MemoryView\":1076\n *     for dim in range(memview.view.ndim):\n *         dst.shape[dim] = shape[dim]\n *         dst.strides[dim] = strides[dim]   
          # <<<<<<<<<<<<<<\n *         dst.suboffsets[dim] = suboffsets[dim] if suboffsets else -1\n * \n */\n    (__pyx_v_dst->strides[__pyx_v_dim]) = (__pyx_v_strides[__pyx_v_dim]);\n\n    /* \"View.MemoryView\":1077\n *         dst.shape[dim] = shape[dim]\n *         dst.strides[dim] = strides[dim]\n *         dst.suboffsets[dim] = suboffsets[dim] if suboffsets else -1             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_copy_object')\n */\n    if ((__pyx_v_suboffsets != 0)) {\n      __pyx_t_5 = (__pyx_v_suboffsets[__pyx_v_dim]);\n    } else {\n      __pyx_t_5 = -1L;\n    }\n    (__pyx_v_dst->suboffsets[__pyx_v_dim]) = __pyx_t_5;\n  }\n\n  /* \"View.MemoryView\":1063\n * \n * @cname('__pyx_memoryview_slice_copy')\n * cdef void slice_copy(memoryview memview, __Pyx_memviewslice *dst):             # <<<<<<<<<<<<<<\n *     cdef int dim\n *     cdef (Py_ssize_t*) shape, strides, suboffsets\n */\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\n/* \"View.MemoryView\":1080\n * \n * @cname('__pyx_memoryview_copy_object')\n * cdef memoryview_copy(memoryview memview):             # <<<<<<<<<<<<<<\n *     \"Create a new memoryview object\"\n *     cdef __Pyx_memviewslice memviewslice\n */\n\nstatic PyObject *__pyx_memoryview_copy_object(struct __pyx_memoryview_obj *__pyx_v_memview) {\n  __Pyx_memviewslice __pyx_v_memviewslice;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"memoryview_copy\", 0);\n\n  /* \"View.MemoryView\":1083\n *     \"Create a new memoryview object\"\n *     cdef __Pyx_memviewslice memviewslice\n *     slice_copy(memview, &memviewslice)             # <<<<<<<<<<<<<<\n *     return memoryview_copy_from_slice(memview, &memviewslice)\n * \n */\n  __pyx_memoryview_slice_copy(__pyx_v_memview, (&__pyx_v_memviewslice));\n\n  /* \"View.MemoryView\":1084\n *     cdef 
__Pyx_memviewslice memviewslice\n *     slice_copy(memview, &memviewslice)\n *     return memoryview_copy_from_slice(memview, &memviewslice)             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_copy_object_from_slice')\n */\n  __Pyx_XDECREF(__pyx_r);\n  __pyx_t_1 = __pyx_memoryview_copy_object_from_slice(__pyx_v_memview, (&__pyx_v_memviewslice)); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 1084, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_r = __pyx_t_1;\n  __pyx_t_1 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":1080\n * \n * @cname('__pyx_memoryview_copy_object')\n * cdef memoryview_copy(memoryview memview):             # <<<<<<<<<<<<<<\n *     \"Create a new memoryview object\"\n *     cdef __Pyx_memviewslice memviewslice\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview_copy\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1087\n * \n * @cname('__pyx_memoryview_copy_object_from_slice')\n * cdef memoryview_copy_from_slice(memoryview memview, __Pyx_memviewslice *memviewslice):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Create a new memoryview object from a given memoryview object and slice.\n */\n\nstatic PyObject *__pyx_memoryview_copy_object_from_slice(struct __pyx_memoryview_obj *__pyx_v_memview, __Pyx_memviewslice *__pyx_v_memviewslice) {\n  PyObject *(*__pyx_v_to_object_func)(char *);\n  int (*__pyx_v_to_dtype_func)(char *, PyObject *);\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  int __pyx_t_2;\n  PyObject *(*__pyx_t_3)(char *);\n  int (*__pyx_t_4)(char *, PyObject *);\n  PyObject *__pyx_t_5 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"memoryview_copy_from_slice\", 0);\n\n  /* 
\"View.MemoryView\":1094\n *     cdef int (*to_dtype_func)(char *, object) except 0\n * \n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         to_object_func = (<_memoryviewslice> memview).to_object_func\n *         to_dtype_func = (<_memoryviewslice> memview).to_dtype_func\n */\n  __pyx_t_1 = __Pyx_TypeCheck(((PyObject *)__pyx_v_memview), __pyx_memoryviewslice_type); \n  __pyx_t_2 = (__pyx_t_1 != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1095\n * \n *     if isinstance(memview, _memoryviewslice):\n *         to_object_func = (<_memoryviewslice> memview).to_object_func             # <<<<<<<<<<<<<<\n *         to_dtype_func = (<_memoryviewslice> memview).to_dtype_func\n *     else:\n */\n    __pyx_t_3 = ((struct __pyx_memoryviewslice_obj *)__pyx_v_memview)->to_object_func;\n    __pyx_v_to_object_func = __pyx_t_3;\n\n    /* \"View.MemoryView\":1096\n *     if isinstance(memview, _memoryviewslice):\n *         to_object_func = (<_memoryviewslice> memview).to_object_func\n *         to_dtype_func = (<_memoryviewslice> memview).to_dtype_func             # <<<<<<<<<<<<<<\n *     else:\n *         to_object_func = NULL\n */\n    __pyx_t_4 = ((struct __pyx_memoryviewslice_obj *)__pyx_v_memview)->to_dtype_func;\n    __pyx_v_to_dtype_func = __pyx_t_4;\n\n    /* \"View.MemoryView\":1094\n *     cdef int (*to_dtype_func)(char *, object) except 0\n * \n *     if isinstance(memview, _memoryviewslice):             # <<<<<<<<<<<<<<\n *         to_object_func = (<_memoryviewslice> memview).to_object_func\n *         to_dtype_func = (<_memoryviewslice> memview).to_dtype_func\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":1098\n *         to_dtype_func = (<_memoryviewslice> memview).to_dtype_func\n *     else:\n *         to_object_func = NULL             # <<<<<<<<<<<<<<\n *         to_dtype_func = NULL\n * \n */\n  /*else*/ {\n    __pyx_v_to_object_func = NULL;\n\n    /* \"View.MemoryView\":1099\n *     else:\n *         
to_object_func = NULL\n *         to_dtype_func = NULL             # <<<<<<<<<<<<<<\n * \n *     return memoryview_fromslice(memviewslice[0], memview.view.ndim,\n */\n    __pyx_v_to_dtype_func = NULL;\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":1101\n *         to_dtype_func = NULL\n * \n *     return memoryview_fromslice(memviewslice[0], memview.view.ndim,             # <<<<<<<<<<<<<<\n *                                 to_object_func, to_dtype_func,\n *                                 memview.dtype_is_object)\n */\n  __Pyx_XDECREF(__pyx_r);\n\n  /* \"View.MemoryView\":1103\n *     return memoryview_fromslice(memviewslice[0], memview.view.ndim,\n *                                 to_object_func, to_dtype_func,\n *                                 memview.dtype_is_object)             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_5 = __pyx_memoryview_fromslice((__pyx_v_memviewslice[0]), __pyx_v_memview->view.ndim, __pyx_v_to_object_func, __pyx_v_to_dtype_func, __pyx_v_memview->dtype_is_object); if (unlikely(!__pyx_t_5)) __PYX_ERR(2, 1101, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_5);\n  __pyx_r = __pyx_t_5;\n  __pyx_t_5 = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":1087\n * \n * @cname('__pyx_memoryview_copy_object_from_slice')\n * cdef memoryview_copy_from_slice(memoryview memview, __Pyx_memviewslice *memviewslice):             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Create a new memoryview object from a given memoryview object and slice.\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView.memoryview_copy_from_slice\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1109\n * \n * \n * cdef Py_ssize_t abs_py_ssize_t(Py_ssize_t arg) nogil:             # <<<<<<<<<<<<<<\n *     if arg < 0:\n *         return -arg\n */\n\nstatic Py_ssize_t 
abs_py_ssize_t(Py_ssize_t __pyx_v_arg) {\n  Py_ssize_t __pyx_r;\n  int __pyx_t_1;\n\n  /* \"View.MemoryView\":1110\n * \n * cdef Py_ssize_t abs_py_ssize_t(Py_ssize_t arg) nogil:\n *     if arg < 0:             # <<<<<<<<<<<<<<\n *         return -arg\n *     else:\n */\n  __pyx_t_1 = ((__pyx_v_arg < 0) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":1111\n * cdef Py_ssize_t abs_py_ssize_t(Py_ssize_t arg) nogil:\n *     if arg < 0:\n *         return -arg             # <<<<<<<<<<<<<<\n *     else:\n *         return arg\n */\n    __pyx_r = (-__pyx_v_arg);\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":1110\n * \n * cdef Py_ssize_t abs_py_ssize_t(Py_ssize_t arg) nogil:\n *     if arg < 0:             # <<<<<<<<<<<<<<\n *         return -arg\n *     else:\n */\n  }\n\n  /* \"View.MemoryView\":1113\n *         return -arg\n *     else:\n *         return arg             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_get_best_slice_order')\n */\n  /*else*/ {\n    __pyx_r = __pyx_v_arg;\n    goto __pyx_L0;\n  }\n\n  /* \"View.MemoryView\":1109\n * \n * \n * cdef Py_ssize_t abs_py_ssize_t(Py_ssize_t arg) nogil:             # <<<<<<<<<<<<<<\n *     if arg < 0:\n *         return -arg\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1116\n * \n * @cname('__pyx_get_best_slice_order')\n * cdef char get_best_order(__Pyx_memviewslice *mslice, int ndim) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Figure out the best memory access order for a given slice.\n */\n\nstatic char __pyx_get_best_slice_order(__Pyx_memviewslice *__pyx_v_mslice, int __pyx_v_ndim) {\n  int __pyx_v_i;\n  Py_ssize_t __pyx_v_c_stride;\n  Py_ssize_t __pyx_v_f_stride;\n  char __pyx_r;\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  int __pyx_t_4;\n\n  /* \"View.MemoryView\":1121\n *     \"\"\"\n *     cdef int i\n *     cdef Py_ssize_t c_stride = 0             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t f_stride = 0\n * \n */\n  
__pyx_v_c_stride = 0;\n\n  /* \"View.MemoryView\":1122\n *     cdef int i\n *     cdef Py_ssize_t c_stride = 0\n *     cdef Py_ssize_t f_stride = 0             # <<<<<<<<<<<<<<\n * \n *     for i in range(ndim - 1, -1, -1):\n */\n  __pyx_v_f_stride = 0;\n\n  /* \"View.MemoryView\":1124\n *     cdef Py_ssize_t f_stride = 0\n * \n *     for i in range(ndim - 1, -1, -1):             # <<<<<<<<<<<<<<\n *         if mslice.shape[i] > 1:\n *             c_stride = mslice.strides[i]\n */\n  for (__pyx_t_1 = (__pyx_v_ndim - 1); __pyx_t_1 > -1; __pyx_t_1-=1) {\n    __pyx_v_i = __pyx_t_1;\n\n    /* \"View.MemoryView\":1125\n * \n *     for i in range(ndim - 1, -1, -1):\n *         if mslice.shape[i] > 1:             # <<<<<<<<<<<<<<\n *             c_stride = mslice.strides[i]\n *             break\n */\n    __pyx_t_2 = (((__pyx_v_mslice->shape[__pyx_v_i]) > 1) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1126\n *     for i in range(ndim - 1, -1, -1):\n *         if mslice.shape[i] > 1:\n *             c_stride = mslice.strides[i]             # <<<<<<<<<<<<<<\n *             break\n * \n */\n      __pyx_v_c_stride = (__pyx_v_mslice->strides[__pyx_v_i]);\n\n      /* \"View.MemoryView\":1127\n *         if mslice.shape[i] > 1:\n *             c_stride = mslice.strides[i]\n *             break             # <<<<<<<<<<<<<<\n * \n *     for i in range(ndim):\n */\n      goto __pyx_L4_break;\n\n      /* \"View.MemoryView\":1125\n * \n *     for i in range(ndim - 1, -1, -1):\n *         if mslice.shape[i] > 1:             # <<<<<<<<<<<<<<\n *             c_stride = mslice.strides[i]\n *             break\n */\n    }\n  }\n  __pyx_L4_break:;\n\n  /* \"View.MemoryView\":1129\n *             break\n * \n *     for i in range(ndim):             # <<<<<<<<<<<<<<\n *         if mslice.shape[i] > 1:\n *             f_stride = mslice.strides[i]\n */\n  __pyx_t_1 = __pyx_v_ndim;\n  __pyx_t_3 = __pyx_t_1;\n  for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {\n    
__pyx_v_i = __pyx_t_4;\n\n    /* \"View.MemoryView\":1130\n * \n *     for i in range(ndim):\n *         if mslice.shape[i] > 1:             # <<<<<<<<<<<<<<\n *             f_stride = mslice.strides[i]\n *             break\n */\n    __pyx_t_2 = (((__pyx_v_mslice->shape[__pyx_v_i]) > 1) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1131\n *     for i in range(ndim):\n *         if mslice.shape[i] > 1:\n *             f_stride = mslice.strides[i]             # <<<<<<<<<<<<<<\n *             break\n * \n */\n      __pyx_v_f_stride = (__pyx_v_mslice->strides[__pyx_v_i]);\n\n      /* \"View.MemoryView\":1132\n *         if mslice.shape[i] > 1:\n *             f_stride = mslice.strides[i]\n *             break             # <<<<<<<<<<<<<<\n * \n *     if abs_py_ssize_t(c_stride) <= abs_py_ssize_t(f_stride):\n */\n      goto __pyx_L7_break;\n\n      /* \"View.MemoryView\":1130\n * \n *     for i in range(ndim):\n *         if mslice.shape[i] > 1:             # <<<<<<<<<<<<<<\n *             f_stride = mslice.strides[i]\n *             break\n */\n    }\n  }\n  __pyx_L7_break:;\n\n  /* \"View.MemoryView\":1134\n *             break\n * \n *     if abs_py_ssize_t(c_stride) <= abs_py_ssize_t(f_stride):             # <<<<<<<<<<<<<<\n *         return 'C'\n *     else:\n */\n  __pyx_t_2 = ((abs_py_ssize_t(__pyx_v_c_stride) <= abs_py_ssize_t(__pyx_v_f_stride)) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1135\n * \n *     if abs_py_ssize_t(c_stride) <= abs_py_ssize_t(f_stride):\n *         return 'C'             # <<<<<<<<<<<<<<\n *     else:\n *         return 'F'\n */\n    __pyx_r = 'C';\n    goto __pyx_L0;\n\n    /* \"View.MemoryView\":1134\n *             break\n * \n *     if abs_py_ssize_t(c_stride) <= abs_py_ssize_t(f_stride):             # <<<<<<<<<<<<<<\n *         return 'C'\n *     else:\n */\n  }\n\n  /* \"View.MemoryView\":1137\n *         return 'C'\n *     else:\n *         return 'F'             # <<<<<<<<<<<<<<\n * \n * 
@cython.cdivision(True)\n */\n  /*else*/ {\n    __pyx_r = 'F';\n    goto __pyx_L0;\n  }\n\n  /* \"View.MemoryView\":1116\n * \n * @cname('__pyx_get_best_slice_order')\n * cdef char get_best_order(__Pyx_memviewslice *mslice, int ndim) nogil:             # <<<<<<<<<<<<<<\n *     \"\"\"\n *     Figure out the best memory access order for a given slice.\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1140\n * \n * @cython.cdivision(True)\n * cdef void _copy_strided_to_strided(char *src_data, Py_ssize_t *src_strides,             # <<<<<<<<<<<<<<\n *                                    char *dst_data, Py_ssize_t *dst_strides,\n *                                    Py_ssize_t *src_shape, Py_ssize_t *dst_shape,\n */\n\nstatic void _copy_strided_to_strided(char *__pyx_v_src_data, Py_ssize_t *__pyx_v_src_strides, char *__pyx_v_dst_data, Py_ssize_t *__pyx_v_dst_strides, Py_ssize_t *__pyx_v_src_shape, Py_ssize_t *__pyx_v_dst_shape, int __pyx_v_ndim, size_t __pyx_v_itemsize) {\n  CYTHON_UNUSED Py_ssize_t __pyx_v_i;\n  CYTHON_UNUSED Py_ssize_t __pyx_v_src_extent;\n  Py_ssize_t __pyx_v_dst_extent;\n  Py_ssize_t __pyx_v_src_stride;\n  Py_ssize_t __pyx_v_dst_stride;\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  Py_ssize_t __pyx_t_4;\n  Py_ssize_t __pyx_t_5;\n  Py_ssize_t __pyx_t_6;\n\n  /* \"View.MemoryView\":1147\n * \n *     cdef Py_ssize_t i\n *     cdef Py_ssize_t src_extent = src_shape[0]             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t dst_extent = dst_shape[0]\n *     cdef Py_ssize_t src_stride = src_strides[0]\n */\n  __pyx_v_src_extent = (__pyx_v_src_shape[0]);\n\n  /* \"View.MemoryView\":1148\n *     cdef Py_ssize_t i\n *     cdef Py_ssize_t src_extent = src_shape[0]\n *     cdef Py_ssize_t dst_extent = dst_shape[0]             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t src_stride = src_strides[0]\n *     cdef Py_ssize_t dst_stride = dst_strides[0]\n */\n  __pyx_v_dst_extent = (__pyx_v_dst_shape[0]);\n\n  /* 
\"View.MemoryView\":1149\n *     cdef Py_ssize_t src_extent = src_shape[0]\n *     cdef Py_ssize_t dst_extent = dst_shape[0]\n *     cdef Py_ssize_t src_stride = src_strides[0]             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t dst_stride = dst_strides[0]\n * \n */\n  __pyx_v_src_stride = (__pyx_v_src_strides[0]);\n\n  /* \"View.MemoryView\":1150\n *     cdef Py_ssize_t dst_extent = dst_shape[0]\n *     cdef Py_ssize_t src_stride = src_strides[0]\n *     cdef Py_ssize_t dst_stride = dst_strides[0]             # <<<<<<<<<<<<<<\n * \n *     if ndim == 1:\n */\n  __pyx_v_dst_stride = (__pyx_v_dst_strides[0]);\n\n  /* \"View.MemoryView\":1152\n *     cdef Py_ssize_t dst_stride = dst_strides[0]\n * \n *     if ndim == 1:             # <<<<<<<<<<<<<<\n *        if (src_stride > 0 and dst_stride > 0 and\n *            <size_t> src_stride == itemsize == <size_t> dst_stride):\n */\n  __pyx_t_1 = ((__pyx_v_ndim == 1) != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":1153\n * \n *     if ndim == 1:\n *        if (src_stride > 0 and dst_stride > 0 and             # <<<<<<<<<<<<<<\n *            <size_t> src_stride == itemsize == <size_t> dst_stride):\n *            memcpy(dst_data, src_data, itemsize * dst_extent)\n */\n    __pyx_t_2 = ((__pyx_v_src_stride > 0) != 0);\n    if (__pyx_t_2) {\n    } else {\n      __pyx_t_1 = __pyx_t_2;\n      goto __pyx_L5_bool_binop_done;\n    }\n    __pyx_t_2 = ((__pyx_v_dst_stride > 0) != 0);\n    if (__pyx_t_2) {\n    } else {\n      __pyx_t_1 = __pyx_t_2;\n      goto __pyx_L5_bool_binop_done;\n    }\n\n    /* \"View.MemoryView\":1154\n *     if ndim == 1:\n *        if (src_stride > 0 and dst_stride > 0 and\n *            <size_t> src_stride == itemsize == <size_t> dst_stride):             # <<<<<<<<<<<<<<\n *            memcpy(dst_data, src_data, itemsize * dst_extent)\n *        else:\n */\n    __pyx_t_2 = (((size_t)__pyx_v_src_stride) == __pyx_v_itemsize);\n    if (__pyx_t_2) {\n      __pyx_t_2 = (__pyx_v_itemsize == 
((size_t)__pyx_v_dst_stride));\n    }\n    __pyx_t_3 = (__pyx_t_2 != 0);\n    __pyx_t_1 = __pyx_t_3;\n    __pyx_L5_bool_binop_done:;\n\n    /* \"View.MemoryView\":1153\n * \n *     if ndim == 1:\n *        if (src_stride > 0 and dst_stride > 0 and             # <<<<<<<<<<<<<<\n *            <size_t> src_stride == itemsize == <size_t> dst_stride):\n *            memcpy(dst_data, src_data, itemsize * dst_extent)\n */\n    if (__pyx_t_1) {\n\n      /* \"View.MemoryView\":1155\n *        if (src_stride > 0 and dst_stride > 0 and\n *            <size_t> src_stride == itemsize == <size_t> dst_stride):\n *            memcpy(dst_data, src_data, itemsize * dst_extent)             # <<<<<<<<<<<<<<\n *        else:\n *            for i in range(dst_extent):\n */\n      (void)(memcpy(__pyx_v_dst_data, __pyx_v_src_data, (__pyx_v_itemsize * __pyx_v_dst_extent)));\n\n      /* \"View.MemoryView\":1153\n * \n *     if ndim == 1:\n *        if (src_stride > 0 and dst_stride > 0 and             # <<<<<<<<<<<<<<\n *            <size_t> src_stride == itemsize == <size_t> dst_stride):\n *            memcpy(dst_data, src_data, itemsize * dst_extent)\n */\n      goto __pyx_L4;\n    }\n\n    /* \"View.MemoryView\":1157\n *            memcpy(dst_data, src_data, itemsize * dst_extent)\n *        else:\n *            for i in range(dst_extent):             # <<<<<<<<<<<<<<\n *                memcpy(dst_data, src_data, itemsize)\n *                src_data += src_stride\n */\n    /*else*/ {\n      __pyx_t_4 = __pyx_v_dst_extent;\n      __pyx_t_5 = __pyx_t_4;\n      for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {\n        __pyx_v_i = __pyx_t_6;\n\n        /* \"View.MemoryView\":1158\n *        else:\n *            for i in range(dst_extent):\n *                memcpy(dst_data, src_data, itemsize)             # <<<<<<<<<<<<<<\n *                src_data += src_stride\n *                dst_data += dst_stride\n */\n        (void)(memcpy(__pyx_v_dst_data, __pyx_v_src_data, 
__pyx_v_itemsize));\n\n        /* \"View.MemoryView\":1159\n *            for i in range(dst_extent):\n *                memcpy(dst_data, src_data, itemsize)\n *                src_data += src_stride             # <<<<<<<<<<<<<<\n *                dst_data += dst_stride\n *     else:\n */\n        __pyx_v_src_data = (__pyx_v_src_data + __pyx_v_src_stride);\n\n        /* \"View.MemoryView\":1160\n *                memcpy(dst_data, src_data, itemsize)\n *                src_data += src_stride\n *                dst_data += dst_stride             # <<<<<<<<<<<<<<\n *     else:\n *         for i in range(dst_extent):\n */\n        __pyx_v_dst_data = (__pyx_v_dst_data + __pyx_v_dst_stride);\n      }\n    }\n    __pyx_L4:;\n\n    /* \"View.MemoryView\":1152\n *     cdef Py_ssize_t dst_stride = dst_strides[0]\n * \n *     if ndim == 1:             # <<<<<<<<<<<<<<\n *        if (src_stride > 0 and dst_stride > 0 and\n *            <size_t> src_stride == itemsize == <size_t> dst_stride):\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":1162\n *                dst_data += dst_stride\n *     else:\n *         for i in range(dst_extent):             # <<<<<<<<<<<<<<\n *             _copy_strided_to_strided(src_data, src_strides + 1,\n *                                      dst_data, dst_strides + 1,\n */\n  /*else*/ {\n    __pyx_t_4 = __pyx_v_dst_extent;\n    __pyx_t_5 = __pyx_t_4;\n    for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {\n      __pyx_v_i = __pyx_t_6;\n\n      /* \"View.MemoryView\":1163\n *     else:\n *         for i in range(dst_extent):\n *             _copy_strided_to_strided(src_data, src_strides + 1,             # <<<<<<<<<<<<<<\n *                                      dst_data, dst_strides + 1,\n *                                      src_shape + 1, dst_shape + 1,\n */\n      _copy_strided_to_strided(__pyx_v_src_data, (__pyx_v_src_strides + 1), __pyx_v_dst_data, (__pyx_v_dst_strides + 1), (__pyx_v_src_shape + 1), (__pyx_v_dst_shape + 
1), (__pyx_v_ndim - 1), __pyx_v_itemsize);\n\n      /* \"View.MemoryView\":1167\n *                                      src_shape + 1, dst_shape + 1,\n *                                      ndim - 1, itemsize)\n *             src_data += src_stride             # <<<<<<<<<<<<<<\n *             dst_data += dst_stride\n * \n */\n      __pyx_v_src_data = (__pyx_v_src_data + __pyx_v_src_stride);\n\n      /* \"View.MemoryView\":1168\n *                                      ndim - 1, itemsize)\n *             src_data += src_stride\n *             dst_data += dst_stride             # <<<<<<<<<<<<<<\n * \n * cdef void copy_strided_to_strided(__Pyx_memviewslice *src,\n */\n      __pyx_v_dst_data = (__pyx_v_dst_data + __pyx_v_dst_stride);\n    }\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":1140\n * \n * @cython.cdivision(True)\n * cdef void _copy_strided_to_strided(char *src_data, Py_ssize_t *src_strides,             # <<<<<<<<<<<<<<\n *                                    char *dst_data, Py_ssize_t *dst_strides,\n *                                    Py_ssize_t *src_shape, Py_ssize_t *dst_shape,\n */\n\n  /* function exit code */\n}\n\n/* \"View.MemoryView\":1170\n *             dst_data += dst_stride\n * \n * cdef void copy_strided_to_strided(__Pyx_memviewslice *src,             # <<<<<<<<<<<<<<\n *                                   __Pyx_memviewslice *dst,\n *                                   int ndim, size_t itemsize) nogil:\n */\n\nstatic void copy_strided_to_strided(__Pyx_memviewslice *__pyx_v_src, __Pyx_memviewslice *__pyx_v_dst, int __pyx_v_ndim, size_t __pyx_v_itemsize) {\n\n  /* \"View.MemoryView\":1173\n *                                   __Pyx_memviewslice *dst,\n *                                   int ndim, size_t itemsize) nogil:\n *     _copy_strided_to_strided(src.data, src.strides, dst.data, dst.strides,             # <<<<<<<<<<<<<<\n *                              src.shape, dst.shape, ndim, itemsize)\n * \n */\n  
_copy_strided_to_strided(__pyx_v_src->data, __pyx_v_src->strides, __pyx_v_dst->data, __pyx_v_dst->strides, __pyx_v_src->shape, __pyx_v_dst->shape, __pyx_v_ndim, __pyx_v_itemsize);\n\n  /* \"View.MemoryView\":1170\n *             dst_data += dst_stride\n * \n * cdef void copy_strided_to_strided(__Pyx_memviewslice *src,             # <<<<<<<<<<<<<<\n *                                   __Pyx_memviewslice *dst,\n *                                   int ndim, size_t itemsize) nogil:\n */\n\n  /* function exit code */\n}\n\n/* \"View.MemoryView\":1177\n * \n * @cname('__pyx_memoryview_slice_get_size')\n * cdef Py_ssize_t slice_get_size(__Pyx_memviewslice *src, int ndim) nogil:             # <<<<<<<<<<<<<<\n *     \"Return the size of the memory occupied by the slice in number of bytes\"\n *     cdef Py_ssize_t shape, size = src.memview.view.itemsize\n */\n\nstatic Py_ssize_t __pyx_memoryview_slice_get_size(__Pyx_memviewslice *__pyx_v_src, int __pyx_v_ndim) {\n  Py_ssize_t __pyx_v_shape;\n  Py_ssize_t __pyx_v_size;\n  Py_ssize_t __pyx_r;\n  Py_ssize_t __pyx_t_1;\n  Py_ssize_t *__pyx_t_2;\n  Py_ssize_t *__pyx_t_3;\n  Py_ssize_t *__pyx_t_4;\n\n  /* \"View.MemoryView\":1179\n * cdef Py_ssize_t slice_get_size(__Pyx_memviewslice *src, int ndim) nogil:\n *     \"Return the size of the memory occupied by the slice in number of bytes\"\n *     cdef Py_ssize_t shape, size = src.memview.view.itemsize             # <<<<<<<<<<<<<<\n * \n *     for shape in src.shape[:ndim]:\n */\n  __pyx_t_1 = __pyx_v_src->memview->view.itemsize;\n  __pyx_v_size = __pyx_t_1;\n\n  /* \"View.MemoryView\":1181\n *     cdef Py_ssize_t shape, size = src.memview.view.itemsize\n * \n *     for shape in src.shape[:ndim]:             # <<<<<<<<<<<<<<\n *         size *= shape\n * \n */\n  __pyx_t_3 = (__pyx_v_src->shape + __pyx_v_ndim);\n  for (__pyx_t_4 = __pyx_v_src->shape; __pyx_t_4 < __pyx_t_3; __pyx_t_4++) {\n    __pyx_t_2 = __pyx_t_4;\n    __pyx_v_shape = (__pyx_t_2[0]);\n\n    /* 
\"View.MemoryView\":1182\n * \n *     for shape in src.shape[:ndim]:\n *         size *= shape             # <<<<<<<<<<<<<<\n * \n *     return size\n */\n    __pyx_v_size = (__pyx_v_size * __pyx_v_shape);\n  }\n\n  /* \"View.MemoryView\":1184\n *         size *= shape\n * \n *     return size             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_fill_contig_strides_array')\n */\n  __pyx_r = __pyx_v_size;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":1177\n * \n * @cname('__pyx_memoryview_slice_get_size')\n * cdef Py_ssize_t slice_get_size(__Pyx_memviewslice *src, int ndim) nogil:             # <<<<<<<<<<<<<<\n *     \"Return the size of the memory occupied by the slice in number of bytes\"\n *     cdef Py_ssize_t shape, size = src.memview.view.itemsize\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1187\n * \n * @cname('__pyx_fill_contig_strides_array')\n * cdef Py_ssize_t fill_contig_strides_array(             # <<<<<<<<<<<<<<\n *                 Py_ssize_t *shape, Py_ssize_t *strides, Py_ssize_t stride,\n *                 int ndim, char order) nogil:\n */\n\nstatic Py_ssize_t __pyx_fill_contig_strides_array(Py_ssize_t *__pyx_v_shape, Py_ssize_t *__pyx_v_strides, Py_ssize_t __pyx_v_stride, int __pyx_v_ndim, char __pyx_v_order) {\n  int __pyx_v_idx;\n  Py_ssize_t __pyx_r;\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  int __pyx_t_4;\n\n  /* \"View.MemoryView\":1196\n *     cdef int idx\n * \n *     if order == 'F':             # <<<<<<<<<<<<<<\n *         for idx in range(ndim):\n *             strides[idx] = stride\n */\n  __pyx_t_1 = ((__pyx_v_order == 'F') != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":1197\n * \n *     if order == 'F':\n *         for idx in range(ndim):             # <<<<<<<<<<<<<<\n *             strides[idx] = stride\n *             stride *= shape[idx]\n */\n    __pyx_t_2 = __pyx_v_ndim;\n    __pyx_t_3 = __pyx_t_2;\n    for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; 
__pyx_t_4+=1) {\n      __pyx_v_idx = __pyx_t_4;\n\n      /* \"View.MemoryView\":1198\n *     if order == 'F':\n *         for idx in range(ndim):\n *             strides[idx] = stride             # <<<<<<<<<<<<<<\n *             stride *= shape[idx]\n *     else:\n */\n      (__pyx_v_strides[__pyx_v_idx]) = __pyx_v_stride;\n\n      /* \"View.MemoryView\":1199\n *         for idx in range(ndim):\n *             strides[idx] = stride\n *             stride *= shape[idx]             # <<<<<<<<<<<<<<\n *     else:\n *         for idx in range(ndim - 1, -1, -1):\n */\n      __pyx_v_stride = (__pyx_v_stride * (__pyx_v_shape[__pyx_v_idx]));\n    }\n\n    /* \"View.MemoryView\":1196\n *     cdef int idx\n * \n *     if order == 'F':             # <<<<<<<<<<<<<<\n *         for idx in range(ndim):\n *             strides[idx] = stride\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":1201\n *             stride *= shape[idx]\n *     else:\n *         for idx in range(ndim - 1, -1, -1):             # <<<<<<<<<<<<<<\n *             strides[idx] = stride\n *             stride *= shape[idx]\n */\n  /*else*/ {\n    for (__pyx_t_2 = (__pyx_v_ndim - 1); __pyx_t_2 > -1; __pyx_t_2-=1) {\n      __pyx_v_idx = __pyx_t_2;\n\n      /* \"View.MemoryView\":1202\n *     else:\n *         for idx in range(ndim - 1, -1, -1):\n *             strides[idx] = stride             # <<<<<<<<<<<<<<\n *             stride *= shape[idx]\n * \n */\n      (__pyx_v_strides[__pyx_v_idx]) = __pyx_v_stride;\n\n      /* \"View.MemoryView\":1203\n *         for idx in range(ndim - 1, -1, -1):\n *             strides[idx] = stride\n *             stride *= shape[idx]             # <<<<<<<<<<<<<<\n * \n *     return stride\n */\n      __pyx_v_stride = (__pyx_v_stride * (__pyx_v_shape[__pyx_v_idx]));\n    }\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":1205\n *             stride *= shape[idx]\n * \n *     return stride             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_copy_data_to_temp')\n 
*/\n  __pyx_r = __pyx_v_stride;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":1187\n * \n * @cname('__pyx_fill_contig_strides_array')\n * cdef Py_ssize_t fill_contig_strides_array(             # <<<<<<<<<<<<<<\n *                 Py_ssize_t *shape, Py_ssize_t *strides, Py_ssize_t stride,\n *                 int ndim, char order) nogil:\n */\n\n  /* function exit code */\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1208\n * \n * @cname('__pyx_memoryview_copy_data_to_temp')\n * cdef void *copy_data_to_temp(__Pyx_memviewslice *src,             # <<<<<<<<<<<<<<\n *                              __Pyx_memviewslice *tmpslice,\n *                              char order,\n */\n\nstatic void *__pyx_memoryview_copy_data_to_temp(__Pyx_memviewslice *__pyx_v_src, __Pyx_memviewslice *__pyx_v_tmpslice, char __pyx_v_order, int __pyx_v_ndim) {\n  int __pyx_v_i;\n  void *__pyx_v_result;\n  size_t __pyx_v_itemsize;\n  size_t __pyx_v_size;\n  void *__pyx_r;\n  Py_ssize_t __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  struct __pyx_memoryview_obj *__pyx_t_4;\n  int __pyx_t_5;\n  int __pyx_t_6;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n\n  /* \"View.MemoryView\":1219\n *     cdef void *result\n * \n *     cdef size_t itemsize = src.memview.view.itemsize             # <<<<<<<<<<<<<<\n *     cdef size_t size = slice_get_size(src, ndim)\n * \n */\n  __pyx_t_1 = __pyx_v_src->memview->view.itemsize;\n  __pyx_v_itemsize = __pyx_t_1;\n\n  /* \"View.MemoryView\":1220\n * \n *     cdef size_t itemsize = src.memview.view.itemsize\n *     cdef size_t size = slice_get_size(src, ndim)             # <<<<<<<<<<<<<<\n * \n *     result = malloc(size)\n */\n  __pyx_v_size = __pyx_memoryview_slice_get_size(__pyx_v_src, __pyx_v_ndim);\n\n  /* \"View.MemoryView\":1222\n *     cdef size_t size = slice_get_size(src, ndim)\n * \n *     result = malloc(size)             # <<<<<<<<<<<<<<\n *     if not result:\n *         _err(MemoryError, 
NULL)\n */\n  __pyx_v_result = malloc(__pyx_v_size);\n\n  /* \"View.MemoryView\":1223\n * \n *     result = malloc(size)\n *     if not result:             # <<<<<<<<<<<<<<\n *         _err(MemoryError, NULL)\n * \n */\n  __pyx_t_2 = ((!(__pyx_v_result != 0)) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1224\n *     result = malloc(size)\n *     if not result:\n *         _err(MemoryError, NULL)             # <<<<<<<<<<<<<<\n * \n * \n */\n    __pyx_t_3 = __pyx_memoryview_err(__pyx_builtin_MemoryError, NULL); if (unlikely(__pyx_t_3 == ((int)-1))) __PYX_ERR(2, 1224, __pyx_L1_error)\n\n    /* \"View.MemoryView\":1223\n * \n *     result = malloc(size)\n *     if not result:             # <<<<<<<<<<<<<<\n *         _err(MemoryError, NULL)\n * \n */\n  }\n\n  /* \"View.MemoryView\":1227\n * \n * \n *     tmpslice.data = <char *> result             # <<<<<<<<<<<<<<\n *     tmpslice.memview = src.memview\n *     for i in range(ndim):\n */\n  __pyx_v_tmpslice->data = ((char *)__pyx_v_result);\n\n  /* \"View.MemoryView\":1228\n * \n *     tmpslice.data = <char *> result\n *     tmpslice.memview = src.memview             # <<<<<<<<<<<<<<\n *     for i in range(ndim):\n *         tmpslice.shape[i] = src.shape[i]\n */\n  __pyx_t_4 = __pyx_v_src->memview;\n  __pyx_v_tmpslice->memview = __pyx_t_4;\n\n  /* \"View.MemoryView\":1229\n *     tmpslice.data = <char *> result\n *     tmpslice.memview = src.memview\n *     for i in range(ndim):             # <<<<<<<<<<<<<<\n *         tmpslice.shape[i] = src.shape[i]\n *         tmpslice.suboffsets[i] = -1\n */\n  __pyx_t_3 = __pyx_v_ndim;\n  __pyx_t_5 = __pyx_t_3;\n  for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {\n    __pyx_v_i = __pyx_t_6;\n\n    /* \"View.MemoryView\":1230\n *     tmpslice.memview = src.memview\n *     for i in range(ndim):\n *         tmpslice.shape[i] = src.shape[i]             # <<<<<<<<<<<<<<\n *         tmpslice.suboffsets[i] = -1\n * \n */\n    (__pyx_v_tmpslice->shape[__pyx_v_i]) = 
(__pyx_v_src->shape[__pyx_v_i]);\n\n    /* \"View.MemoryView\":1231\n *     for i in range(ndim):\n *         tmpslice.shape[i] = src.shape[i]\n *         tmpslice.suboffsets[i] = -1             # <<<<<<<<<<<<<<\n * \n *     fill_contig_strides_array(&tmpslice.shape[0], &tmpslice.strides[0], itemsize,\n */\n    (__pyx_v_tmpslice->suboffsets[__pyx_v_i]) = -1L;\n  }\n\n  /* \"View.MemoryView\":1233\n *         tmpslice.suboffsets[i] = -1\n * \n *     fill_contig_strides_array(&tmpslice.shape[0], &tmpslice.strides[0], itemsize,             # <<<<<<<<<<<<<<\n *                               ndim, order)\n * \n */\n  (void)(__pyx_fill_contig_strides_array((&(__pyx_v_tmpslice->shape[0])), (&(__pyx_v_tmpslice->strides[0])), __pyx_v_itemsize, __pyx_v_ndim, __pyx_v_order));\n\n  /* \"View.MemoryView\":1237\n * \n * \n *     for i in range(ndim):             # <<<<<<<<<<<<<<\n *         if tmpslice.shape[i] == 1:\n *             tmpslice.strides[i] = 0\n */\n  __pyx_t_3 = __pyx_v_ndim;\n  __pyx_t_5 = __pyx_t_3;\n  for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {\n    __pyx_v_i = __pyx_t_6;\n\n    /* \"View.MemoryView\":1238\n * \n *     for i in range(ndim):\n *         if tmpslice.shape[i] == 1:             # <<<<<<<<<<<<<<\n *             tmpslice.strides[i] = 0\n * \n */\n    __pyx_t_2 = (((__pyx_v_tmpslice->shape[__pyx_v_i]) == 1) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1239\n *     for i in range(ndim):\n *         if tmpslice.shape[i] == 1:\n *             tmpslice.strides[i] = 0             # <<<<<<<<<<<<<<\n * \n *     if slice_is_contig(src[0], order, ndim):\n */\n      (__pyx_v_tmpslice->strides[__pyx_v_i]) = 0;\n\n      /* \"View.MemoryView\":1238\n * \n *     for i in range(ndim):\n *         if tmpslice.shape[i] == 1:             # <<<<<<<<<<<<<<\n *             tmpslice.strides[i] = 0\n * \n */\n    }\n  }\n\n  /* \"View.MemoryView\":1241\n *             tmpslice.strides[i] = 0\n * \n *     if slice_is_contig(src[0], order, ndim): 
            # <<<<<<<<<<<<<<\n *         memcpy(result, src.data, size)\n *     else:\n */\n  __pyx_t_2 = (__pyx_memviewslice_is_contig((__pyx_v_src[0]), __pyx_v_order, __pyx_v_ndim) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1242\n * \n *     if slice_is_contig(src[0], order, ndim):\n *         memcpy(result, src.data, size)             # <<<<<<<<<<<<<<\n *     else:\n *         copy_strided_to_strided(src, tmpslice, ndim, itemsize)\n */\n    (void)(memcpy(__pyx_v_result, __pyx_v_src->data, __pyx_v_size));\n\n    /* \"View.MemoryView\":1241\n *             tmpslice.strides[i] = 0\n * \n *     if slice_is_contig(src[0], order, ndim):             # <<<<<<<<<<<<<<\n *         memcpy(result, src.data, size)\n *     else:\n */\n    goto __pyx_L9;\n  }\n\n  /* \"View.MemoryView\":1244\n *         memcpy(result, src.data, size)\n *     else:\n *         copy_strided_to_strided(src, tmpslice, ndim, itemsize)             # <<<<<<<<<<<<<<\n * \n *     return result\n */\n  /*else*/ {\n    copy_strided_to_strided(__pyx_v_src, __pyx_v_tmpslice, __pyx_v_ndim, __pyx_v_itemsize);\n  }\n  __pyx_L9:;\n\n  /* \"View.MemoryView\":1246\n *         copy_strided_to_strided(src, tmpslice, ndim, itemsize)\n * \n *     return result             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_r = __pyx_v_result;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":1208\n * \n * @cname('__pyx_memoryview_copy_data_to_temp')\n * cdef void *copy_data_to_temp(__Pyx_memviewslice *src,             # <<<<<<<<<<<<<<\n *                              __Pyx_memviewslice *tmpslice,\n *                              char order,\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  {\n    #ifdef WITH_THREAD\n    PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n    #endif\n    __Pyx_AddTraceback(\"View.MemoryView.copy_data_to_temp\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n    #ifdef WITH_THREAD\n    __Pyx_PyGILState_Release(__pyx_gilstate_save);\n    #endif\n  }\n  __pyx_r = 
NULL;\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1251\n * \n * @cname('__pyx_memoryview_err_extents')\n * cdef int _err_extents(int i, Py_ssize_t extent1,             # <<<<<<<<<<<<<<\n *                              Py_ssize_t extent2) except -1 with gil:\n *     raise ValueError(\"got differing extents in dimension %d (got %d and %d)\" %\n */\n\nstatic int __pyx_memoryview_err_extents(int __pyx_v_i, Py_ssize_t __pyx_v_extent1, Py_ssize_t __pyx_v_extent2) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  #ifdef WITH_THREAD\n  PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n  #endif\n  __Pyx_RefNannySetupContext(\"_err_extents\", 0);\n\n  /* \"View.MemoryView\":1254\n *                              Py_ssize_t extent2) except -1 with gil:\n *     raise ValueError(\"got differing extents in dimension %d (got %d and %d)\" %\n *                                                         (i, extent1, extent2))             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_err_dim')\n */\n  __pyx_t_1 = __Pyx_PyInt_From_int(__pyx_v_i); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 1254, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_t_2 = PyInt_FromSsize_t(__pyx_v_extent1); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1254, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_3 = PyInt_FromSsize_t(__pyx_v_extent2); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 1254, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __pyx_t_4 = PyTuple_New(3); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 1254, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_4);\n  __Pyx_GIVEREF(__pyx_t_1);\n  PyTuple_SET_ITEM(__pyx_t_4, 0, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_2);\n  PyTuple_SET_ITEM(__pyx_t_4, 1, __pyx_t_2);\n  __Pyx_GIVEREF(__pyx_t_3);\n  PyTuple_SET_ITEM(__pyx_t_4, 2, 
__pyx_t_3);\n  __pyx_t_1 = 0;\n  __pyx_t_2 = 0;\n  __pyx_t_3 = 0;\n\n  /* \"View.MemoryView\":1253\n * cdef int _err_extents(int i, Py_ssize_t extent1,\n *                              Py_ssize_t extent2) except -1 with gil:\n *     raise ValueError(\"got differing extents in dimension %d (got %d and %d)\" %             # <<<<<<<<<<<<<<\n *                                                         (i, extent1, extent2))\n * \n */\n  __pyx_t_3 = __Pyx_PyString_Format(__pyx_kp_s_got_differing_extents_in_dimensi, __pyx_t_4); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 1253, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n  __pyx_t_4 = __Pyx_PyObject_CallOneArg(__pyx_builtin_ValueError, __pyx_t_3); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 1253, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_4);\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __Pyx_Raise(__pyx_t_4, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n  __PYX_ERR(2, 1253, __pyx_L1_error)\n\n  /* \"View.MemoryView\":1251\n * \n * @cname('__pyx_memoryview_err_extents')\n * cdef int _err_extents(int i, Py_ssize_t extent1,             # <<<<<<<<<<<<<<\n *                              Py_ssize_t extent2) except -1 with gil:\n *     raise ValueError(\"got differing extents in dimension %d (got %d and %d)\" %\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_AddTraceback(\"View.MemoryView._err_extents\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __Pyx_RefNannyFinishContext();\n  #ifdef WITH_THREAD\n  __Pyx_PyGILState_Release(__pyx_gilstate_save);\n  #endif\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1257\n * \n * @cname('__pyx_memoryview_err_dim')\n * cdef int _err_dim(object error, char *msg, int dim) except -1 with gil:             # <<<<<<<<<<<<<<\n *     raise error(msg.decode('ascii') % dim)\n * \n */\n\nstatic int 
__pyx_memoryview_err_dim(PyObject *__pyx_v_error, char *__pyx_v_msg, int __pyx_v_dim) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  #ifdef WITH_THREAD\n  PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n  #endif\n  __Pyx_RefNannySetupContext(\"_err_dim\", 0);\n  __Pyx_INCREF(__pyx_v_error);\n\n  /* \"View.MemoryView\":1258\n * @cname('__pyx_memoryview_err_dim')\n * cdef int _err_dim(object error, char *msg, int dim) except -1 with gil:\n *     raise error(msg.decode('ascii') % dim)             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_err')\n */\n  __pyx_t_2 = __Pyx_decode_c_string(__pyx_v_msg, 0, strlen(__pyx_v_msg), NULL, NULL, PyUnicode_DecodeASCII); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1258, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_3 = __Pyx_PyInt_From_int(__pyx_v_dim); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 1258, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __pyx_t_4 = PyUnicode_Format(__pyx_t_2, __pyx_t_3); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 1258, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_4);\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __Pyx_INCREF(__pyx_v_error);\n  __pyx_t_3 = __pyx_v_error; __pyx_t_2 = NULL;\n  if (CYTHON_UNPACK_METHODS && unlikely(PyMethod_Check(__pyx_t_3))) {\n    __pyx_t_2 = PyMethod_GET_SELF(__pyx_t_3);\n    if (likely(__pyx_t_2)) {\n      PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_3);\n      __Pyx_INCREF(__pyx_t_2);\n      __Pyx_INCREF(function);\n      __Pyx_DECREF_SET(__pyx_t_3, function);\n    }\n  }\n  __pyx_t_1 = (__pyx_t_2) ? 
__Pyx_PyObject_Call2Args(__pyx_t_3, __pyx_t_2, __pyx_t_4) : __Pyx_PyObject_CallOneArg(__pyx_t_3, __pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;\n  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n  if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 1258, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n  __Pyx_Raise(__pyx_t_1, 0, 0, 0);\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __PYX_ERR(2, 1258, __pyx_L1_error)\n\n  /* \"View.MemoryView\":1257\n * \n * @cname('__pyx_memoryview_err_dim')\n * cdef int _err_dim(object error, char *msg, int dim) except -1 with gil:             # <<<<<<<<<<<<<<\n *     raise error(msg.decode('ascii') % dim)\n * \n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_AddTraceback(\"View.MemoryView._err_dim\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __Pyx_XDECREF(__pyx_v_error);\n  __Pyx_RefNannyFinishContext();\n  #ifdef WITH_THREAD\n  __Pyx_PyGILState_Release(__pyx_gilstate_save);\n  #endif\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1261\n * \n * @cname('__pyx_memoryview_err')\n * cdef int _err(object error, char *msg) except -1 with gil:             # <<<<<<<<<<<<<<\n *     if msg != NULL:\n *         raise error(msg.decode('ascii'))\n */\n\nstatic int __pyx_memoryview_err(PyObject *__pyx_v_error, char *__pyx_v_msg) {\n  int __pyx_r;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  #ifdef WITH_THREAD\n  PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n  #endif\n  __Pyx_RefNannySetupContext(\"_err\", 0);\n  __Pyx_INCREF(__pyx_v_error);\n\n  /* \"View.MemoryView\":1262\n * @cname('__pyx_memoryview_err')\n * cdef int 
_err(object error, char *msg) except -1 with gil:\n *     if msg != NULL:             # <<<<<<<<<<<<<<\n *         raise error(msg.decode('ascii'))\n *     else:\n */\n  __pyx_t_1 = ((__pyx_v_msg != NULL) != 0);\n  if (unlikely(__pyx_t_1)) {\n\n    /* \"View.MemoryView\":1263\n * cdef int _err(object error, char *msg) except -1 with gil:\n *     if msg != NULL:\n *         raise error(msg.decode('ascii'))             # <<<<<<<<<<<<<<\n *     else:\n *         raise error\n */\n    __pyx_t_3 = __Pyx_decode_c_string(__pyx_v_msg, 0, strlen(__pyx_v_msg), NULL, NULL, PyUnicode_DecodeASCII); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 1263, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_INCREF(__pyx_v_error);\n    __pyx_t_4 = __pyx_v_error; __pyx_t_5 = NULL;\n    if (CYTHON_UNPACK_METHODS && unlikely(PyMethod_Check(__pyx_t_4))) {\n      __pyx_t_5 = PyMethod_GET_SELF(__pyx_t_4);\n      if (likely(__pyx_t_5)) {\n        PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_4);\n        __Pyx_INCREF(__pyx_t_5);\n        __Pyx_INCREF(function);\n        __Pyx_DECREF_SET(__pyx_t_4, function);\n      }\n    }\n    __pyx_t_2 = (__pyx_t_5) ? 
__Pyx_PyObject_Call2Args(__pyx_t_4, __pyx_t_5, __pyx_t_3) : __Pyx_PyObject_CallOneArg(__pyx_t_4, __pyx_t_3);\n    __Pyx_XDECREF(__pyx_t_5); __pyx_t_5 = 0;\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 1263, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n    __Pyx_Raise(__pyx_t_2, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __PYX_ERR(2, 1263, __pyx_L1_error)\n\n    /* \"View.MemoryView\":1262\n * @cname('__pyx_memoryview_err')\n * cdef int _err(object error, char *msg) except -1 with gil:\n *     if msg != NULL:             # <<<<<<<<<<<<<<\n *         raise error(msg.decode('ascii'))\n *     else:\n */\n  }\n\n  /* \"View.MemoryView\":1265\n *         raise error(msg.decode('ascii'))\n *     else:\n *         raise error             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_copy_contents')\n */\n  /*else*/ {\n    __Pyx_Raise(__pyx_v_error, 0, 0, 0);\n    __PYX_ERR(2, 1265, __pyx_L1_error)\n  }\n\n  /* \"View.MemoryView\":1261\n * \n * @cname('__pyx_memoryview_err')\n * cdef int _err(object error, char *msg) except -1 with gil:             # <<<<<<<<<<<<<<\n *     if msg != NULL:\n *         raise error(msg.decode('ascii'))\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView._err\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = -1;\n  __Pyx_XDECREF(__pyx_v_error);\n  __Pyx_RefNannyFinishContext();\n  #ifdef WITH_THREAD\n  __Pyx_PyGILState_Release(__pyx_gilstate_save);\n  #endif\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1268\n * \n * @cname('__pyx_memoryview_copy_contents')\n * cdef int memoryview_copy_contents(__Pyx_memviewslice src,             # <<<<<<<<<<<<<<\n *                                   __Pyx_memviewslice dst,\n *                                   int src_ndim, int 
dst_ndim,\n */\n\nstatic int __pyx_memoryview_copy_contents(__Pyx_memviewslice __pyx_v_src, __Pyx_memviewslice __pyx_v_dst, int __pyx_v_src_ndim, int __pyx_v_dst_ndim, int __pyx_v_dtype_is_object) {\n  void *__pyx_v_tmpdata;\n  size_t __pyx_v_itemsize;\n  int __pyx_v_i;\n  char __pyx_v_order;\n  int __pyx_v_broadcasting;\n  int __pyx_v_direct_copy;\n  __Pyx_memviewslice __pyx_v_tmp;\n  int __pyx_v_ndim;\n  int __pyx_r;\n  Py_ssize_t __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n  int __pyx_t_4;\n  int __pyx_t_5;\n  int __pyx_t_6;\n  void *__pyx_t_7;\n  int __pyx_t_8;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n\n  /* \"View.MemoryView\":1276\n *     Check for overlapping memory and verify the shapes.\n *     \"\"\"\n *     cdef void *tmpdata = NULL             # <<<<<<<<<<<<<<\n *     cdef size_t itemsize = src.memview.view.itemsize\n *     cdef int i\n */\n  __pyx_v_tmpdata = NULL;\n\n  /* \"View.MemoryView\":1277\n *     \"\"\"\n *     cdef void *tmpdata = NULL\n *     cdef size_t itemsize = src.memview.view.itemsize             # <<<<<<<<<<<<<<\n *     cdef int i\n *     cdef char order = get_best_order(&src, src_ndim)\n */\n  __pyx_t_1 = __pyx_v_src.memview->view.itemsize;\n  __pyx_v_itemsize = __pyx_t_1;\n\n  /* \"View.MemoryView\":1279\n *     cdef size_t itemsize = src.memview.view.itemsize\n *     cdef int i\n *     cdef char order = get_best_order(&src, src_ndim)             # <<<<<<<<<<<<<<\n *     cdef bint broadcasting = False\n *     cdef bint direct_copy = False\n */\n  __pyx_v_order = __pyx_get_best_slice_order((&__pyx_v_src), __pyx_v_src_ndim);\n\n  /* \"View.MemoryView\":1280\n *     cdef int i\n *     cdef char order = get_best_order(&src, src_ndim)\n *     cdef bint broadcasting = False             # <<<<<<<<<<<<<<\n *     cdef bint direct_copy = False\n *     cdef __Pyx_memviewslice tmp\n */\n  __pyx_v_broadcasting = 0;\n\n  /* \"View.MemoryView\":1281\n *     cdef char order = get_best_order(&src, 
src_ndim)\n *     cdef bint broadcasting = False\n *     cdef bint direct_copy = False             # <<<<<<<<<<<<<<\n *     cdef __Pyx_memviewslice tmp\n * \n */\n  __pyx_v_direct_copy = 0;\n\n  /* \"View.MemoryView\":1284\n *     cdef __Pyx_memviewslice tmp\n * \n *     if src_ndim < dst_ndim:             # <<<<<<<<<<<<<<\n *         broadcast_leading(&src, src_ndim, dst_ndim)\n *     elif dst_ndim < src_ndim:\n */\n  __pyx_t_2 = ((__pyx_v_src_ndim < __pyx_v_dst_ndim) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1285\n * \n *     if src_ndim < dst_ndim:\n *         broadcast_leading(&src, src_ndim, dst_ndim)             # <<<<<<<<<<<<<<\n *     elif dst_ndim < src_ndim:\n *         broadcast_leading(&dst, dst_ndim, src_ndim)\n */\n    __pyx_memoryview_broadcast_leading((&__pyx_v_src), __pyx_v_src_ndim, __pyx_v_dst_ndim);\n\n    /* \"View.MemoryView\":1284\n *     cdef __Pyx_memviewslice tmp\n * \n *     if src_ndim < dst_ndim:             # <<<<<<<<<<<<<<\n *         broadcast_leading(&src, src_ndim, dst_ndim)\n *     elif dst_ndim < src_ndim:\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":1286\n *     if src_ndim < dst_ndim:\n *         broadcast_leading(&src, src_ndim, dst_ndim)\n *     elif dst_ndim < src_ndim:             # <<<<<<<<<<<<<<\n *         broadcast_leading(&dst, dst_ndim, src_ndim)\n * \n */\n  __pyx_t_2 = ((__pyx_v_dst_ndim < __pyx_v_src_ndim) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1287\n *         broadcast_leading(&src, src_ndim, dst_ndim)\n *     elif dst_ndim < src_ndim:\n *         broadcast_leading(&dst, dst_ndim, src_ndim)             # <<<<<<<<<<<<<<\n * \n *     cdef int ndim = max(src_ndim, dst_ndim)\n */\n    __pyx_memoryview_broadcast_leading((&__pyx_v_dst), __pyx_v_dst_ndim, __pyx_v_src_ndim);\n\n    /* \"View.MemoryView\":1286\n *     if src_ndim < dst_ndim:\n *         broadcast_leading(&src, src_ndim, dst_ndim)\n *     elif dst_ndim < src_ndim:             # <<<<<<<<<<<<<<\n *         
broadcast_leading(&dst, dst_ndim, src_ndim)\n * \n */\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":1289\n *         broadcast_leading(&dst, dst_ndim, src_ndim)\n * \n *     cdef int ndim = max(src_ndim, dst_ndim)             # <<<<<<<<<<<<<<\n * \n *     for i in range(ndim):\n */\n  __pyx_t_3 = __pyx_v_dst_ndim;\n  __pyx_t_4 = __pyx_v_src_ndim;\n  if (((__pyx_t_3 > __pyx_t_4) != 0)) {\n    __pyx_t_5 = __pyx_t_3;\n  } else {\n    __pyx_t_5 = __pyx_t_4;\n  }\n  __pyx_v_ndim = __pyx_t_5;\n\n  /* \"View.MemoryView\":1291\n *     cdef int ndim = max(src_ndim, dst_ndim)\n * \n *     for i in range(ndim):             # <<<<<<<<<<<<<<\n *         if src.shape[i] != dst.shape[i]:\n *             if src.shape[i] == 1:\n */\n  __pyx_t_5 = __pyx_v_ndim;\n  __pyx_t_3 = __pyx_t_5;\n  for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {\n    __pyx_v_i = __pyx_t_4;\n\n    /* \"View.MemoryView\":1292\n * \n *     for i in range(ndim):\n *         if src.shape[i] != dst.shape[i]:             # <<<<<<<<<<<<<<\n *             if src.shape[i] == 1:\n *                 broadcasting = True\n */\n    __pyx_t_2 = (((__pyx_v_src.shape[__pyx_v_i]) != (__pyx_v_dst.shape[__pyx_v_i])) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1293\n *     for i in range(ndim):\n *         if src.shape[i] != dst.shape[i]:\n *             if src.shape[i] == 1:             # <<<<<<<<<<<<<<\n *                 broadcasting = True\n *                 src.strides[i] = 0\n */\n      __pyx_t_2 = (((__pyx_v_src.shape[__pyx_v_i]) == 1) != 0);\n      if (__pyx_t_2) {\n\n        /* \"View.MemoryView\":1294\n *         if src.shape[i] != dst.shape[i]:\n *             if src.shape[i] == 1:\n *                 broadcasting = True             # <<<<<<<<<<<<<<\n *                 src.strides[i] = 0\n *             else:\n */\n        __pyx_v_broadcasting = 1;\n\n        /* \"View.MemoryView\":1295\n *             if src.shape[i] == 1:\n *                 broadcasting = True\n *                 
src.strides[i] = 0             # <<<<<<<<<<<<<<\n *             else:\n *                 _err_extents(i, dst.shape[i], src.shape[i])\n */\n        (__pyx_v_src.strides[__pyx_v_i]) = 0;\n\n        /* \"View.MemoryView\":1293\n *     for i in range(ndim):\n *         if src.shape[i] != dst.shape[i]:\n *             if src.shape[i] == 1:             # <<<<<<<<<<<<<<\n *                 broadcasting = True\n *                 src.strides[i] = 0\n */\n        goto __pyx_L7;\n      }\n\n      /* \"View.MemoryView\":1297\n *                 src.strides[i] = 0\n *             else:\n *                 _err_extents(i, dst.shape[i], src.shape[i])             # <<<<<<<<<<<<<<\n * \n *         if src.suboffsets[i] >= 0:\n */\n      /*else*/ {\n        __pyx_t_6 = __pyx_memoryview_err_extents(__pyx_v_i, (__pyx_v_dst.shape[__pyx_v_i]), (__pyx_v_src.shape[__pyx_v_i])); if (unlikely(__pyx_t_6 == ((int)-1))) __PYX_ERR(2, 1297, __pyx_L1_error)\n      }\n      __pyx_L7:;\n\n      /* \"View.MemoryView\":1292\n * \n *     for i in range(ndim):\n *         if src.shape[i] != dst.shape[i]:             # <<<<<<<<<<<<<<\n *             if src.shape[i] == 1:\n *                 broadcasting = True\n */\n    }\n\n    /* \"View.MemoryView\":1299\n *                 _err_extents(i, dst.shape[i], src.shape[i])\n * \n *         if src.suboffsets[i] >= 0:             # <<<<<<<<<<<<<<\n *             _err_dim(ValueError, \"Dimension %d is not direct\", i)\n * \n */\n    __pyx_t_2 = (((__pyx_v_src.suboffsets[__pyx_v_i]) >= 0) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1300\n * \n *         if src.suboffsets[i] >= 0:\n *             _err_dim(ValueError, \"Dimension %d is not direct\", i)             # <<<<<<<<<<<<<<\n * \n *     if slices_overlap(&src, &dst, ndim, itemsize):\n */\n      __pyx_t_6 = __pyx_memoryview_err_dim(__pyx_builtin_ValueError, ((char *)\"Dimension %d is not direct\"), __pyx_v_i); if (unlikely(__pyx_t_6 == ((int)-1))) __PYX_ERR(2, 1300, __pyx_L1_error)\n\n     
 /* \"View.MemoryView\":1299\n *                 _err_extents(i, dst.shape[i], src.shape[i])\n * \n *         if src.suboffsets[i] >= 0:             # <<<<<<<<<<<<<<\n *             _err_dim(ValueError, \"Dimension %d is not direct\", i)\n * \n */\n    }\n  }\n\n  /* \"View.MemoryView\":1302\n *             _err_dim(ValueError, \"Dimension %d is not direct\", i)\n * \n *     if slices_overlap(&src, &dst, ndim, itemsize):             # <<<<<<<<<<<<<<\n * \n *         if not slice_is_contig(src, order, ndim):\n */\n  __pyx_t_2 = (__pyx_slices_overlap((&__pyx_v_src), (&__pyx_v_dst), __pyx_v_ndim, __pyx_v_itemsize) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1304\n *     if slices_overlap(&src, &dst, ndim, itemsize):\n * \n *         if not slice_is_contig(src, order, ndim):             # <<<<<<<<<<<<<<\n *             order = get_best_order(&dst, ndim)\n * \n */\n    __pyx_t_2 = ((!(__pyx_memviewslice_is_contig(__pyx_v_src, __pyx_v_order, __pyx_v_ndim) != 0)) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1305\n * \n *         if not slice_is_contig(src, order, ndim):\n *             order = get_best_order(&dst, ndim)             # <<<<<<<<<<<<<<\n * \n *         tmpdata = copy_data_to_temp(&src, &tmp, order, ndim)\n */\n      __pyx_v_order = __pyx_get_best_slice_order((&__pyx_v_dst), __pyx_v_ndim);\n\n      /* \"View.MemoryView\":1304\n *     if slices_overlap(&src, &dst, ndim, itemsize):\n * \n *         if not slice_is_contig(src, order, ndim):             # <<<<<<<<<<<<<<\n *             order = get_best_order(&dst, ndim)\n * \n */\n    }\n\n    /* \"View.MemoryView\":1307\n *             order = get_best_order(&dst, ndim)\n * \n *         tmpdata = copy_data_to_temp(&src, &tmp, order, ndim)             # <<<<<<<<<<<<<<\n *         src = tmp\n * \n */\n    __pyx_t_7 = __pyx_memoryview_copy_data_to_temp((&__pyx_v_src), (&__pyx_v_tmp), __pyx_v_order, __pyx_v_ndim); if (unlikely(__pyx_t_7 == ((void *)NULL))) __PYX_ERR(2, 1307, 
__pyx_L1_error)\n    __pyx_v_tmpdata = __pyx_t_7;\n\n    /* \"View.MemoryView\":1308\n * \n *         tmpdata = copy_data_to_temp(&src, &tmp, order, ndim)\n *         src = tmp             # <<<<<<<<<<<<<<\n * \n *     if not broadcasting:\n */\n    __pyx_v_src = __pyx_v_tmp;\n\n    /* \"View.MemoryView\":1302\n *             _err_dim(ValueError, \"Dimension %d is not direct\", i)\n * \n *     if slices_overlap(&src, &dst, ndim, itemsize):             # <<<<<<<<<<<<<<\n * \n *         if not slice_is_contig(src, order, ndim):\n */\n  }\n\n  /* \"View.MemoryView\":1310\n *         src = tmp\n * \n *     if not broadcasting:             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_2 = ((!(__pyx_v_broadcasting != 0)) != 0);\n  if (__pyx_t_2) {\n\n    /* \"View.MemoryView\":1313\n * \n * \n *         if slice_is_contig(src, 'C', ndim):             # <<<<<<<<<<<<<<\n *             direct_copy = slice_is_contig(dst, 'C', ndim)\n *         elif slice_is_contig(src, 'F', ndim):\n */\n    __pyx_t_2 = (__pyx_memviewslice_is_contig(__pyx_v_src, 'C', __pyx_v_ndim) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1314\n * \n *         if slice_is_contig(src, 'C', ndim):\n *             direct_copy = slice_is_contig(dst, 'C', ndim)             # <<<<<<<<<<<<<<\n *         elif slice_is_contig(src, 'F', ndim):\n *             direct_copy = slice_is_contig(dst, 'F', ndim)\n */\n      __pyx_v_direct_copy = __pyx_memviewslice_is_contig(__pyx_v_dst, 'C', __pyx_v_ndim);\n\n      /* \"View.MemoryView\":1313\n * \n * \n *         if slice_is_contig(src, 'C', ndim):             # <<<<<<<<<<<<<<\n *             direct_copy = slice_is_contig(dst, 'C', ndim)\n *         elif slice_is_contig(src, 'F', ndim):\n */\n      goto __pyx_L12;\n    }\n\n    /* \"View.MemoryView\":1315\n *         if slice_is_contig(src, 'C', ndim):\n *             direct_copy = slice_is_contig(dst, 'C', ndim)\n *         elif slice_is_contig(src, 'F', ndim):             # <<<<<<<<<<<<<<\n *             
direct_copy = slice_is_contig(dst, 'F', ndim)\n * \n */\n    __pyx_t_2 = (__pyx_memviewslice_is_contig(__pyx_v_src, 'F', __pyx_v_ndim) != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1316\n *             direct_copy = slice_is_contig(dst, 'C', ndim)\n *         elif slice_is_contig(src, 'F', ndim):\n *             direct_copy = slice_is_contig(dst, 'F', ndim)             # <<<<<<<<<<<<<<\n * \n *         if direct_copy:\n */\n      __pyx_v_direct_copy = __pyx_memviewslice_is_contig(__pyx_v_dst, 'F', __pyx_v_ndim);\n\n      /* \"View.MemoryView\":1315\n *         if slice_is_contig(src, 'C', ndim):\n *             direct_copy = slice_is_contig(dst, 'C', ndim)\n *         elif slice_is_contig(src, 'F', ndim):             # <<<<<<<<<<<<<<\n *             direct_copy = slice_is_contig(dst, 'F', ndim)\n * \n */\n    }\n    __pyx_L12:;\n\n    /* \"View.MemoryView\":1318\n *             direct_copy = slice_is_contig(dst, 'F', ndim)\n * \n *         if direct_copy:             # <<<<<<<<<<<<<<\n * \n *             refcount_copying(&dst, dtype_is_object, ndim, False)\n */\n    __pyx_t_2 = (__pyx_v_direct_copy != 0);\n    if (__pyx_t_2) {\n\n      /* \"View.MemoryView\":1320\n *         if direct_copy:\n * \n *             refcount_copying(&dst, dtype_is_object, ndim, False)             # <<<<<<<<<<<<<<\n *             memcpy(dst.data, src.data, slice_get_size(&src, ndim))\n *             refcount_copying(&dst, dtype_is_object, ndim, True)\n */\n      __pyx_memoryview_refcount_copying((&__pyx_v_dst), __pyx_v_dtype_is_object, __pyx_v_ndim, 0);\n\n      /* \"View.MemoryView\":1321\n * \n *             refcount_copying(&dst, dtype_is_object, ndim, False)\n *             memcpy(dst.data, src.data, slice_get_size(&src, ndim))             # <<<<<<<<<<<<<<\n *             refcount_copying(&dst, dtype_is_object, ndim, True)\n *             free(tmpdata)\n */\n      (void)(memcpy(__pyx_v_dst.data, __pyx_v_src.data, __pyx_memoryview_slice_get_size((&__pyx_v_src), 
__pyx_v_ndim)));\n\n      /* \"View.MemoryView\":1322\n *             refcount_copying(&dst, dtype_is_object, ndim, False)\n *             memcpy(dst.data, src.data, slice_get_size(&src, ndim))\n *             refcount_copying(&dst, dtype_is_object, ndim, True)             # <<<<<<<<<<<<<<\n *             free(tmpdata)\n *             return 0\n */\n      __pyx_memoryview_refcount_copying((&__pyx_v_dst), __pyx_v_dtype_is_object, __pyx_v_ndim, 1);\n\n      /* \"View.MemoryView\":1323\n *             memcpy(dst.data, src.data, slice_get_size(&src, ndim))\n *             refcount_copying(&dst, dtype_is_object, ndim, True)\n *             free(tmpdata)             # <<<<<<<<<<<<<<\n *             return 0\n * \n */\n      free(__pyx_v_tmpdata);\n\n      /* \"View.MemoryView\":1324\n *             refcount_copying(&dst, dtype_is_object, ndim, True)\n *             free(tmpdata)\n *             return 0             # <<<<<<<<<<<<<<\n * \n *     if order == 'F' == get_best_order(&dst, ndim):\n */\n      __pyx_r = 0;\n      goto __pyx_L0;\n\n      /* \"View.MemoryView\":1318\n *             direct_copy = slice_is_contig(dst, 'F', ndim)\n * \n *         if direct_copy:             # <<<<<<<<<<<<<<\n * \n *             refcount_copying(&dst, dtype_is_object, ndim, False)\n */\n    }\n\n    /* \"View.MemoryView\":1310\n *         src = tmp\n * \n *     if not broadcasting:             # <<<<<<<<<<<<<<\n * \n * \n */\n  }\n\n  /* \"View.MemoryView\":1326\n *             return 0\n * \n *     if order == 'F' == get_best_order(&dst, ndim):             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_2 = (__pyx_v_order == 'F');\n  if (__pyx_t_2) {\n    __pyx_t_2 = ('F' == __pyx_get_best_slice_order((&__pyx_v_dst), __pyx_v_ndim));\n  }\n  __pyx_t_8 = (__pyx_t_2 != 0);\n  if (__pyx_t_8) {\n\n    /* \"View.MemoryView\":1329\n * \n * \n *         transpose_memslice(&src)             # <<<<<<<<<<<<<<\n *         transpose_memslice(&dst)\n * \n */\n    __pyx_t_5 = 
__pyx_memslice_transpose((&__pyx_v_src)); if (unlikely(__pyx_t_5 == ((int)0))) __PYX_ERR(2, 1329, __pyx_L1_error)\n\n    /* \"View.MemoryView\":1330\n * \n *         transpose_memslice(&src)\n *         transpose_memslice(&dst)             # <<<<<<<<<<<<<<\n * \n *     refcount_copying(&dst, dtype_is_object, ndim, False)\n */\n    __pyx_t_5 = __pyx_memslice_transpose((&__pyx_v_dst)); if (unlikely(__pyx_t_5 == ((int)0))) __PYX_ERR(2, 1330, __pyx_L1_error)\n\n    /* \"View.MemoryView\":1326\n *             return 0\n * \n *     if order == 'F' == get_best_order(&dst, ndim):             # <<<<<<<<<<<<<<\n * \n * \n */\n  }\n\n  /* \"View.MemoryView\":1332\n *         transpose_memslice(&dst)\n * \n *     refcount_copying(&dst, dtype_is_object, ndim, False)             # <<<<<<<<<<<<<<\n *     copy_strided_to_strided(&src, &dst, ndim, itemsize)\n *     refcount_copying(&dst, dtype_is_object, ndim, True)\n */\n  __pyx_memoryview_refcount_copying((&__pyx_v_dst), __pyx_v_dtype_is_object, __pyx_v_ndim, 0);\n\n  /* \"View.MemoryView\":1333\n * \n *     refcount_copying(&dst, dtype_is_object, ndim, False)\n *     copy_strided_to_strided(&src, &dst, ndim, itemsize)             # <<<<<<<<<<<<<<\n *     refcount_copying(&dst, dtype_is_object, ndim, True)\n * \n */\n  copy_strided_to_strided((&__pyx_v_src), (&__pyx_v_dst), __pyx_v_ndim, __pyx_v_itemsize);\n\n  /* \"View.MemoryView\":1334\n *     refcount_copying(&dst, dtype_is_object, ndim, False)\n *     copy_strided_to_strided(&src, &dst, ndim, itemsize)\n *     refcount_copying(&dst, dtype_is_object, ndim, True)             # <<<<<<<<<<<<<<\n * \n *     free(tmpdata)\n */\n  __pyx_memoryview_refcount_copying((&__pyx_v_dst), __pyx_v_dtype_is_object, __pyx_v_ndim, 1);\n\n  /* \"View.MemoryView\":1336\n *     refcount_copying(&dst, dtype_is_object, ndim, True)\n * \n *     free(tmpdata)             # <<<<<<<<<<<<<<\n *     return 0\n * \n */\n  free(__pyx_v_tmpdata);\n\n  /* \"View.MemoryView\":1337\n * \n *     free(tmpdata)\n 
*     return 0             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_broadcast_leading')\n */\n  __pyx_r = 0;\n  goto __pyx_L0;\n\n  /* \"View.MemoryView\":1268\n * \n * @cname('__pyx_memoryview_copy_contents')\n * cdef int memoryview_copy_contents(__Pyx_memviewslice src,             # <<<<<<<<<<<<<<\n *                                   __Pyx_memviewslice dst,\n *                                   int src_ndim, int dst_ndim,\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  {\n    #ifdef WITH_THREAD\n    PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n    #endif\n    __Pyx_AddTraceback(\"View.MemoryView.memoryview_copy_contents\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n    #ifdef WITH_THREAD\n    __Pyx_PyGILState_Release(__pyx_gilstate_save);\n    #endif\n  }\n  __pyx_r = -1;\n  __pyx_L0:;\n  return __pyx_r;\n}\n\n/* \"View.MemoryView\":1340\n * \n * @cname('__pyx_memoryview_broadcast_leading')\n * cdef void broadcast_leading(__Pyx_memviewslice *mslice,             # <<<<<<<<<<<<<<\n *                             int ndim,\n *                             int ndim_other) nogil:\n */\n\nstatic void __pyx_memoryview_broadcast_leading(__Pyx_memviewslice *__pyx_v_mslice, int __pyx_v_ndim, int __pyx_v_ndim_other) {\n  int __pyx_v_i;\n  int __pyx_v_offset;\n  int __pyx_t_1;\n  int __pyx_t_2;\n  int __pyx_t_3;\n\n  /* \"View.MemoryView\":1344\n *                             int ndim_other) nogil:\n *     cdef int i\n *     cdef int offset = ndim_other - ndim             # <<<<<<<<<<<<<<\n * \n *     for i in range(ndim - 1, -1, -1):\n */\n  __pyx_v_offset = (__pyx_v_ndim_other - __pyx_v_ndim);\n\n  /* \"View.MemoryView\":1346\n *     cdef int offset = ndim_other - ndim\n * \n *     for i in range(ndim - 1, -1, -1):             # <<<<<<<<<<<<<<\n *         mslice.shape[i + offset] = mslice.shape[i]\n *         mslice.strides[i + offset] = mslice.strides[i]\n */\n  for (__pyx_t_1 = (__pyx_v_ndim - 1); __pyx_t_1 > -1; __pyx_t_1-=1) 
{\n    __pyx_v_i = __pyx_t_1;\n\n    /* \"View.MemoryView\":1347\n * \n *     for i in range(ndim - 1, -1, -1):\n *         mslice.shape[i + offset] = mslice.shape[i]             # <<<<<<<<<<<<<<\n *         mslice.strides[i + offset] = mslice.strides[i]\n *         mslice.suboffsets[i + offset] = mslice.suboffsets[i]\n */\n    (__pyx_v_mslice->shape[(__pyx_v_i + __pyx_v_offset)]) = (__pyx_v_mslice->shape[__pyx_v_i]);\n\n    /* \"View.MemoryView\":1348\n *     for i in range(ndim - 1, -1, -1):\n *         mslice.shape[i + offset] = mslice.shape[i]\n *         mslice.strides[i + offset] = mslice.strides[i]             # <<<<<<<<<<<<<<\n *         mslice.suboffsets[i + offset] = mslice.suboffsets[i]\n * \n */\n    (__pyx_v_mslice->strides[(__pyx_v_i + __pyx_v_offset)]) = (__pyx_v_mslice->strides[__pyx_v_i]);\n\n    /* \"View.MemoryView\":1349\n *         mslice.shape[i + offset] = mslice.shape[i]\n *         mslice.strides[i + offset] = mslice.strides[i]\n *         mslice.suboffsets[i + offset] = mslice.suboffsets[i]             # <<<<<<<<<<<<<<\n * \n *     for i in range(offset):\n */\n    (__pyx_v_mslice->suboffsets[(__pyx_v_i + __pyx_v_offset)]) = (__pyx_v_mslice->suboffsets[__pyx_v_i]);\n  }\n\n  /* \"View.MemoryView\":1351\n *         mslice.suboffsets[i + offset] = mslice.suboffsets[i]\n * \n *     for i in range(offset):             # <<<<<<<<<<<<<<\n *         mslice.shape[i] = 1\n *         mslice.strides[i] = mslice.strides[0]\n */\n  __pyx_t_1 = __pyx_v_offset;\n  __pyx_t_2 = __pyx_t_1;\n  for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=1) {\n    __pyx_v_i = __pyx_t_3;\n\n    /* \"View.MemoryView\":1352\n * \n *     for i in range(offset):\n *         mslice.shape[i] = 1             # <<<<<<<<<<<<<<\n *         mslice.strides[i] = mslice.strides[0]\n *         mslice.suboffsets[i] = -1\n */\n    (__pyx_v_mslice->shape[__pyx_v_i]) = 1;\n\n    /* \"View.MemoryView\":1353\n *     for i in range(offset):\n *         mslice.shape[i] = 1\n *         
mslice.strides[i] = mslice.strides[0]             # <<<<<<<<<<<<<<\n *         mslice.suboffsets[i] = -1\n * \n */\n    (__pyx_v_mslice->strides[__pyx_v_i]) = (__pyx_v_mslice->strides[0]);\n\n    /* \"View.MemoryView\":1354\n *         mslice.shape[i] = 1\n *         mslice.strides[i] = mslice.strides[0]\n *         mslice.suboffsets[i] = -1             # <<<<<<<<<<<<<<\n * \n * \n */\n    (__pyx_v_mslice->suboffsets[__pyx_v_i]) = -1L;\n  }\n\n  /* \"View.MemoryView\":1340\n * \n * @cname('__pyx_memoryview_broadcast_leading')\n * cdef void broadcast_leading(__Pyx_memviewslice *mslice,             # <<<<<<<<<<<<<<\n *                             int ndim,\n *                             int ndim_other) nogil:\n */\n\n  /* function exit code */\n}\n\n/* \"View.MemoryView\":1362\n * \n * @cname('__pyx_memoryview_refcount_copying')\n * cdef void refcount_copying(__Pyx_memviewslice *dst, bint dtype_is_object,             # <<<<<<<<<<<<<<\n *                            int ndim, bint inc) nogil:\n * \n */\n\nstatic void __pyx_memoryview_refcount_copying(__Pyx_memviewslice *__pyx_v_dst, int __pyx_v_dtype_is_object, int __pyx_v_ndim, int __pyx_v_inc) {\n  int __pyx_t_1;\n\n  /* \"View.MemoryView\":1366\n * \n * \n *     if dtype_is_object:             # <<<<<<<<<<<<<<\n *         refcount_objects_in_slice_with_gil(dst.data, dst.shape,\n *                                            dst.strides, ndim, inc)\n */\n  __pyx_t_1 = (__pyx_v_dtype_is_object != 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":1367\n * \n *     if dtype_is_object:\n *         refcount_objects_in_slice_with_gil(dst.data, dst.shape,             # <<<<<<<<<<<<<<\n *                                            dst.strides, ndim, inc)\n * \n */\n    __pyx_memoryview_refcount_objects_in_slice_with_gil(__pyx_v_dst->data, __pyx_v_dst->shape, __pyx_v_dst->strides, __pyx_v_ndim, __pyx_v_inc);\n\n    /* \"View.MemoryView\":1366\n * \n * \n *     if dtype_is_object:             # <<<<<<<<<<<<<<\n *         
refcount_objects_in_slice_with_gil(dst.data, dst.shape,\n *                                            dst.strides, ndim, inc)\n */\n  }\n\n  /* \"View.MemoryView\":1362\n * \n * @cname('__pyx_memoryview_refcount_copying')\n * cdef void refcount_copying(__Pyx_memviewslice *dst, bint dtype_is_object,             # <<<<<<<<<<<<<<\n *                            int ndim, bint inc) nogil:\n * \n */\n\n  /* function exit code */\n}\n\n/* \"View.MemoryView\":1371\n * \n * @cname('__pyx_memoryview_refcount_objects_in_slice_with_gil')\n * cdef void refcount_objects_in_slice_with_gil(char *data, Py_ssize_t *shape,             # <<<<<<<<<<<<<<\n *                                              Py_ssize_t *strides, int ndim,\n *                                              bint inc) with gil:\n */\n\nstatic void __pyx_memoryview_refcount_objects_in_slice_with_gil(char *__pyx_v_data, Py_ssize_t *__pyx_v_shape, Py_ssize_t *__pyx_v_strides, int __pyx_v_ndim, int __pyx_v_inc) {\n  __Pyx_RefNannyDeclarations\n  #ifdef WITH_THREAD\n  PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();\n  #endif\n  __Pyx_RefNannySetupContext(\"refcount_objects_in_slice_with_gil\", 0);\n\n  /* \"View.MemoryView\":1374\n *                                              Py_ssize_t *strides, int ndim,\n *                                              bint inc) with gil:\n *     refcount_objects_in_slice(data, shape, strides, ndim, inc)             # <<<<<<<<<<<<<<\n * \n * @cname('__pyx_memoryview_refcount_objects_in_slice')\n */\n  __pyx_memoryview_refcount_objects_in_slice(__pyx_v_data, __pyx_v_shape, __pyx_v_strides, __pyx_v_ndim, __pyx_v_inc);\n\n  /* \"View.MemoryView\":1371\n * \n * @cname('__pyx_memoryview_refcount_objects_in_slice_with_gil')\n * cdef void refcount_objects_in_slice_with_gil(char *data, Py_ssize_t *shape,             # <<<<<<<<<<<<<<\n *                                              Py_ssize_t *strides, int ndim,\n *                                              bint inc) 
with gil:\n */\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  #ifdef WITH_THREAD\n  __Pyx_PyGILState_Release(__pyx_gilstate_save);\n  #endif\n}\n\n/* \"View.MemoryView\":1377\n * \n * @cname('__pyx_memoryview_refcount_objects_in_slice')\n * cdef void refcount_objects_in_slice(char *data, Py_ssize_t *shape,             # <<<<<<<<<<<<<<\n *                                     Py_ssize_t *strides, int ndim, bint inc):\n *     cdef Py_ssize_t i\n */\n\nstatic void __pyx_memoryview_refcount_objects_in_slice(char *__pyx_v_data, Py_ssize_t *__pyx_v_shape, Py_ssize_t *__pyx_v_strides, int __pyx_v_ndim, int __pyx_v_inc) {\n  CYTHON_UNUSED Py_ssize_t __pyx_v_i;\n  __Pyx_RefNannyDeclarations\n  Py_ssize_t __pyx_t_1;\n  Py_ssize_t __pyx_t_2;\n  Py_ssize_t __pyx_t_3;\n  int __pyx_t_4;\n  __Pyx_RefNannySetupContext(\"refcount_objects_in_slice\", 0);\n\n  /* \"View.MemoryView\":1381\n *     cdef Py_ssize_t i\n * \n *     for i in range(shape[0]):             # <<<<<<<<<<<<<<\n *         if ndim == 1:\n *             if inc:\n */\n  __pyx_t_1 = (__pyx_v_shape[0]);\n  __pyx_t_2 = __pyx_t_1;\n  for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=1) {\n    __pyx_v_i = __pyx_t_3;\n\n    /* \"View.MemoryView\":1382\n * \n *     for i in range(shape[0]):\n *         if ndim == 1:             # <<<<<<<<<<<<<<\n *             if inc:\n *                 Py_INCREF((<PyObject **> data)[0])\n */\n    __pyx_t_4 = ((__pyx_v_ndim == 1) != 0);\n    if (__pyx_t_4) {\n\n      /* \"View.MemoryView\":1383\n *     for i in range(shape[0]):\n *         if ndim == 1:\n *             if inc:             # <<<<<<<<<<<<<<\n *                 Py_INCREF((<PyObject **> data)[0])\n *             else:\n */\n      __pyx_t_4 = (__pyx_v_inc != 0);\n      if (__pyx_t_4) {\n\n        /* \"View.MemoryView\":1384\n *         if ndim == 1:\n *             if inc:\n *                 Py_INCREF((<PyObject **> data)[0])             # <<<<<<<<<<<<<<\n *             else:\n *                 
Py_DECREF((<PyObject **> data)[0])\n */\n        Py_INCREF((((PyObject **)__pyx_v_data)[0]));\n\n        /* \"View.MemoryView\":1383\n *     for i in range(shape[0]):\n *         if ndim == 1:\n *             if inc:             # <<<<<<<<<<<<<<\n *                 Py_INCREF((<PyObject **> data)[0])\n *             else:\n */\n        goto __pyx_L6;\n      }\n\n      /* \"View.MemoryView\":1386\n *                 Py_INCREF((<PyObject **> data)[0])\n *             else:\n *                 Py_DECREF((<PyObject **> data)[0])             # <<<<<<<<<<<<<<\n *         else:\n *             refcount_objects_in_slice(data, shape + 1, strides + 1,\n */\n      /*else*/ {\n        Py_DECREF((((PyObject **)__pyx_v_data)[0]));\n      }\n      __pyx_L6:;\n\n      /* \"View.MemoryView\":1382\n * \n *     for i in range(shape[0]):\n *         if ndim == 1:             # <<<<<<<<<<<<<<\n *             if inc:\n *                 Py_INCREF((<PyObject **> data)[0])\n */\n      goto __pyx_L5;\n    }\n\n    /* \"View.MemoryView\":1388\n *                 Py_DECREF((<PyObject **> data)[0])\n *         else:\n *             refcount_objects_in_slice(data, shape + 1, strides + 1,             # <<<<<<<<<<<<<<\n *                                       ndim - 1, inc)\n * \n */\n    /*else*/ {\n\n      /* \"View.MemoryView\":1389\n *         else:\n *             refcount_objects_in_slice(data, shape + 1, strides + 1,\n *                                       ndim - 1, inc)             # <<<<<<<<<<<<<<\n * \n *         data += strides[0]\n */\n      __pyx_memoryview_refcount_objects_in_slice(__pyx_v_data, (__pyx_v_shape + 1), (__pyx_v_strides + 1), (__pyx_v_ndim - 1), __pyx_v_inc);\n    }\n    __pyx_L5:;\n\n    /* \"View.MemoryView\":1391\n *                                       ndim - 1, inc)\n * \n *         data += strides[0]             # <<<<<<<<<<<<<<\n * \n * \n */\n    __pyx_v_data = (__pyx_v_data + (__pyx_v_strides[0]));\n  }\n\n  /* \"View.MemoryView\":1377\n * \n * 
@cname('__pyx_memoryview_refcount_objects_in_slice')\n * cdef void refcount_objects_in_slice(char *data, Py_ssize_t *shape,             # <<<<<<<<<<<<<<\n *                                     Py_ssize_t *strides, int ndim, bint inc):\n *     cdef Py_ssize_t i\n */\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n}\n\n/* \"View.MemoryView\":1397\n * \n * @cname('__pyx_memoryview_slice_assign_scalar')\n * cdef void slice_assign_scalar(__Pyx_memviewslice *dst, int ndim,             # <<<<<<<<<<<<<<\n *                               size_t itemsize, void *item,\n *                               bint dtype_is_object) nogil:\n */\n\nstatic void __pyx_memoryview_slice_assign_scalar(__Pyx_memviewslice *__pyx_v_dst, int __pyx_v_ndim, size_t __pyx_v_itemsize, void *__pyx_v_item, int __pyx_v_dtype_is_object) {\n\n  /* \"View.MemoryView\":1400\n *                               size_t itemsize, void *item,\n *                               bint dtype_is_object) nogil:\n *     refcount_copying(dst, dtype_is_object, ndim, False)             # <<<<<<<<<<<<<<\n *     _slice_assign_scalar(dst.data, dst.shape, dst.strides, ndim,\n *                          itemsize, item)\n */\n  __pyx_memoryview_refcount_copying(__pyx_v_dst, __pyx_v_dtype_is_object, __pyx_v_ndim, 0);\n\n  /* \"View.MemoryView\":1401\n *                               bint dtype_is_object) nogil:\n *     refcount_copying(dst, dtype_is_object, ndim, False)\n *     _slice_assign_scalar(dst.data, dst.shape, dst.strides, ndim,             # <<<<<<<<<<<<<<\n *                          itemsize, item)\n *     refcount_copying(dst, dtype_is_object, ndim, True)\n */\n  __pyx_memoryview__slice_assign_scalar(__pyx_v_dst->data, __pyx_v_dst->shape, __pyx_v_dst->strides, __pyx_v_ndim, __pyx_v_itemsize, __pyx_v_item);\n\n  /* \"View.MemoryView\":1403\n *     _slice_assign_scalar(dst.data, dst.shape, dst.strides, ndim,\n *                          itemsize, item)\n *     refcount_copying(dst, dtype_is_object, ndim, 
True)             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_memoryview_refcount_copying(__pyx_v_dst, __pyx_v_dtype_is_object, __pyx_v_ndim, 1);\n\n  /* \"View.MemoryView\":1397\n * \n * @cname('__pyx_memoryview_slice_assign_scalar')\n * cdef void slice_assign_scalar(__Pyx_memviewslice *dst, int ndim,             # <<<<<<<<<<<<<<\n *                               size_t itemsize, void *item,\n *                               bint dtype_is_object) nogil:\n */\n\n  /* function exit code */\n}\n\n/* \"View.MemoryView\":1407\n * \n * @cname('__pyx_memoryview__slice_assign_scalar')\n * cdef void _slice_assign_scalar(char *data, Py_ssize_t *shape,             # <<<<<<<<<<<<<<\n *                               Py_ssize_t *strides, int ndim,\n *                               size_t itemsize, void *item) nogil:\n */\n\nstatic void __pyx_memoryview__slice_assign_scalar(char *__pyx_v_data, Py_ssize_t *__pyx_v_shape, Py_ssize_t *__pyx_v_strides, int __pyx_v_ndim, size_t __pyx_v_itemsize, void *__pyx_v_item) {\n  CYTHON_UNUSED Py_ssize_t __pyx_v_i;\n  Py_ssize_t __pyx_v_stride;\n  Py_ssize_t __pyx_v_extent;\n  int __pyx_t_1;\n  Py_ssize_t __pyx_t_2;\n  Py_ssize_t __pyx_t_3;\n  Py_ssize_t __pyx_t_4;\n\n  /* \"View.MemoryView\":1411\n *                               size_t itemsize, void *item) nogil:\n *     cdef Py_ssize_t i\n *     cdef Py_ssize_t stride = strides[0]             # <<<<<<<<<<<<<<\n *     cdef Py_ssize_t extent = shape[0]\n * \n */\n  __pyx_v_stride = (__pyx_v_strides[0]);\n\n  /* \"View.MemoryView\":1412\n *     cdef Py_ssize_t i\n *     cdef Py_ssize_t stride = strides[0]\n *     cdef Py_ssize_t extent = shape[0]             # <<<<<<<<<<<<<<\n * \n *     if ndim == 1:\n */\n  __pyx_v_extent = (__pyx_v_shape[0]);\n\n  /* \"View.MemoryView\":1414\n *     cdef Py_ssize_t extent = shape[0]\n * \n *     if ndim == 1:             # <<<<<<<<<<<<<<\n *         for i in range(extent):\n *             memcpy(data, item, itemsize)\n */\n  __pyx_t_1 = ((__pyx_v_ndim == 1) 
!= 0);\n  if (__pyx_t_1) {\n\n    /* \"View.MemoryView\":1415\n * \n *     if ndim == 1:\n *         for i in range(extent):             # <<<<<<<<<<<<<<\n *             memcpy(data, item, itemsize)\n *             data += stride\n */\n    __pyx_t_2 = __pyx_v_extent;\n    __pyx_t_3 = __pyx_t_2;\n    for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {\n      __pyx_v_i = __pyx_t_4;\n\n      /* \"View.MemoryView\":1416\n *     if ndim == 1:\n *         for i in range(extent):\n *             memcpy(data, item, itemsize)             # <<<<<<<<<<<<<<\n *             data += stride\n *     else:\n */\n      (void)(memcpy(__pyx_v_data, __pyx_v_item, __pyx_v_itemsize));\n\n      /* \"View.MemoryView\":1417\n *         for i in range(extent):\n *             memcpy(data, item, itemsize)\n *             data += stride             # <<<<<<<<<<<<<<\n *     else:\n *         for i in range(extent):\n */\n      __pyx_v_data = (__pyx_v_data + __pyx_v_stride);\n    }\n\n    /* \"View.MemoryView\":1414\n *     cdef Py_ssize_t extent = shape[0]\n * \n *     if ndim == 1:             # <<<<<<<<<<<<<<\n *         for i in range(extent):\n *             memcpy(data, item, itemsize)\n */\n    goto __pyx_L3;\n  }\n\n  /* \"View.MemoryView\":1419\n *             data += stride\n *     else:\n *         for i in range(extent):             # <<<<<<<<<<<<<<\n *             _slice_assign_scalar(data, shape + 1, strides + 1,\n *                                 ndim - 1, itemsize, item)\n */\n  /*else*/ {\n    __pyx_t_2 = __pyx_v_extent;\n    __pyx_t_3 = __pyx_t_2;\n    for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {\n      __pyx_v_i = __pyx_t_4;\n\n      /* \"View.MemoryView\":1420\n *     else:\n *         for i in range(extent):\n *             _slice_assign_scalar(data, shape + 1, strides + 1,             # <<<<<<<<<<<<<<\n *                                 ndim - 1, itemsize, item)\n *             data += stride\n */\n      
__pyx_memoryview__slice_assign_scalar(__pyx_v_data, (__pyx_v_shape + 1), (__pyx_v_strides + 1), (__pyx_v_ndim - 1), __pyx_v_itemsize, __pyx_v_item);\n\n      /* \"View.MemoryView\":1422\n *             _slice_assign_scalar(data, shape + 1, strides + 1,\n *                                 ndim - 1, itemsize, item)\n *             data += stride             # <<<<<<<<<<<<<<\n * \n * \n */\n      __pyx_v_data = (__pyx_v_data + __pyx_v_stride);\n    }\n  }\n  __pyx_L3:;\n\n  /* \"View.MemoryView\":1407\n * \n * @cname('__pyx_memoryview__slice_assign_scalar')\n * cdef void _slice_assign_scalar(char *data, Py_ssize_t *shape,             # <<<<<<<<<<<<<<\n *                               Py_ssize_t *strides, int ndim,\n *                               size_t itemsize, void *item) nogil:\n */\n\n  /* function exit code */\n}\n\n/* \"(tree fragment)\":1\n * def __pyx_unpickle_Enum(__pyx_type, long __pyx_checksum, __pyx_state):             # <<<<<<<<<<<<<<\n *     cdef object __pyx_PickleError\n *     cdef object __pyx_result\n */\n\n/* Python wrapper */\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_1__pyx_unpickle_Enum(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/\nstatic PyMethodDef __pyx_mdef_15View_dot_MemoryView_1__pyx_unpickle_Enum = {\"__pyx_unpickle_Enum\", (PyCFunction)(void*)(PyCFunctionWithKeywords)__pyx_pw_15View_dot_MemoryView_1__pyx_unpickle_Enum, METH_VARARGS|METH_KEYWORDS, 0};\nstatic PyObject *__pyx_pw_15View_dot_MemoryView_1__pyx_unpickle_Enum(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {\n  PyObject *__pyx_v___pyx_type = 0;\n  long __pyx_v___pyx_checksum;\n  PyObject *__pyx_v___pyx_state = 0;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  PyObject *__pyx_r = 0;\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__pyx_unpickle_Enum (wrapper)\", 0);\n  {\n    static PyObject **__pyx_pyargnames[] = 
{&__pyx_n_s_pyx_type,&__pyx_n_s_pyx_checksum,&__pyx_n_s_pyx_state,0};\n    PyObject* values[3] = {0,0,0};\n    if (unlikely(__pyx_kwds)) {\n      Py_ssize_t kw_args;\n      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);\n      switch (pos_args) {\n        case  3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n        CYTHON_FALLTHROUGH;\n        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n        CYTHON_FALLTHROUGH;\n        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n        CYTHON_FALLTHROUGH;\n        case  0: break;\n        default: goto __pyx_L5_argtuple_error;\n      }\n      kw_args = PyDict_Size(__pyx_kwds);\n      switch (pos_args) {\n        case  0:\n        if (likely((values[0] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_pyx_type)) != 0)) kw_args--;\n        else goto __pyx_L5_argtuple_error;\n        CYTHON_FALLTHROUGH;\n        case  1:\n        if (likely((values[1] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_pyx_checksum)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"__pyx_unpickle_Enum\", 1, 3, 3, 1); __PYX_ERR(2, 1, __pyx_L3_error)\n        }\n        CYTHON_FALLTHROUGH;\n        case  2:\n        if (likely((values[2] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_pyx_state)) != 0)) kw_args--;\n        else {\n          __Pyx_RaiseArgtupleInvalid(\"__pyx_unpickle_Enum\", 1, 3, 3, 2); __PYX_ERR(2, 1, __pyx_L3_error)\n        }\n      }\n      if (unlikely(kw_args > 0)) {\n        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, \"__pyx_unpickle_Enum\") < 0)) __PYX_ERR(2, 1, __pyx_L3_error)\n      }\n    } else if (PyTuple_GET_SIZE(__pyx_args) != 3) {\n      goto __pyx_L5_argtuple_error;\n    } else {\n      values[0] = PyTuple_GET_ITEM(__pyx_args, 0);\n      values[1] = PyTuple_GET_ITEM(__pyx_args, 1);\n      values[2] = PyTuple_GET_ITEM(__pyx_args, 2);\n    }\n    __pyx_v___pyx_type = values[0];\n    __pyx_v___pyx_checksum = 
__Pyx_PyInt_As_long(values[1]); if (unlikely((__pyx_v___pyx_checksum == (long)-1) && PyErr_Occurred())) __PYX_ERR(2, 1, __pyx_L3_error)\n    __pyx_v___pyx_state = values[2];\n  }\n  goto __pyx_L4_argument_unpacking_done;\n  __pyx_L5_argtuple_error:;\n  __Pyx_RaiseArgtupleInvalid(\"__pyx_unpickle_Enum\", 1, 3, 3, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(2, 1, __pyx_L3_error)\n  __pyx_L3_error:;\n  __Pyx_AddTraceback(\"View.MemoryView.__pyx_unpickle_Enum\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __Pyx_RefNannyFinishContext();\n  return NULL;\n  __pyx_L4_argument_unpacking_done:;\n  __pyx_r = __pyx_pf_15View_dot_MemoryView___pyx_unpickle_Enum(__pyx_self, __pyx_v___pyx_type, __pyx_v___pyx_checksum, __pyx_v___pyx_state);\n\n  /* function exit code */\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\nstatic PyObject *__pyx_pf_15View_dot_MemoryView___pyx_unpickle_Enum(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v___pyx_type, long __pyx_v___pyx_checksum, PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_v___pyx_PickleError = 0;\n  PyObject *__pyx_v___pyx_result = 0;\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  int __pyx_t_1;\n  PyObject *__pyx_t_2 = NULL;\n  PyObject *__pyx_t_3 = NULL;\n  PyObject *__pyx_t_4 = NULL;\n  PyObject *__pyx_t_5 = NULL;\n  int __pyx_t_6;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__pyx_unpickle_Enum\", 0);\n\n  /* \"(tree fragment)\":4\n *     cdef object __pyx_PickleError\n *     cdef object __pyx_result\n *     if __pyx_checksum != 0xb068931:             # <<<<<<<<<<<<<<\n *         from pickle import PickleError as __pyx_PickleError\n *         raise __pyx_PickleError(\"Incompatible checksums (%s vs 0xb068931 = (name))\" % __pyx_checksum)\n */\n  __pyx_t_1 = ((__pyx_v___pyx_checksum != 0xb068931) != 0);\n  if (__pyx_t_1) {\n\n    /* \"(tree fragment)\":5\n *     cdef object __pyx_result\n *     if __pyx_checksum != 
0xb068931:\n *         from pickle import PickleError as __pyx_PickleError             # <<<<<<<<<<<<<<\n *         raise __pyx_PickleError(\"Incompatible checksums (%s vs 0xb068931 = (name))\" % __pyx_checksum)\n *     __pyx_result = Enum.__new__(__pyx_type)\n */\n    __pyx_t_2 = PyList_New(1); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 5, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_INCREF(__pyx_n_s_PickleError);\n    __Pyx_GIVEREF(__pyx_n_s_PickleError);\n    PyList_SET_ITEM(__pyx_t_2, 0, __pyx_n_s_PickleError);\n    __pyx_t_3 = __Pyx_Import(__pyx_n_s_pickle, __pyx_t_2, 0); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 5, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __pyx_t_2 = __Pyx_ImportFrom(__pyx_t_3, __pyx_n_s_PickleError); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 5, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __Pyx_INCREF(__pyx_t_2);\n    __pyx_v___pyx_PickleError = __pyx_t_2;\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n    /* \"(tree fragment)\":6\n *     if __pyx_checksum != 0xb068931:\n *         from pickle import PickleError as __pyx_PickleError\n *         raise __pyx_PickleError(\"Incompatible checksums (%s vs 0xb068931 = (name))\" % __pyx_checksum)             # <<<<<<<<<<<<<<\n *     __pyx_result = Enum.__new__(__pyx_type)\n *     if __pyx_state is not None:\n */\n    __pyx_t_2 = __Pyx_PyInt_From_long(__pyx_v___pyx_checksum); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 6, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_2);\n    __pyx_t_4 = __Pyx_PyString_Format(__pyx_kp_s_Incompatible_checksums_s_vs_0xb0, __pyx_t_2); if (unlikely(!__pyx_t_4)) __PYX_ERR(2, 6, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_4);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __Pyx_INCREF(__pyx_v___pyx_PickleError);\n    __pyx_t_2 = __pyx_v___pyx_PickleError; __pyx_t_5 = NULL;\n    if (CYTHON_UNPACK_METHODS && unlikely(PyMethod_Check(__pyx_t_2))) {\n      __pyx_t_5 = 
PyMethod_GET_SELF(__pyx_t_2);\n      if (likely(__pyx_t_5)) {\n        PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_2);\n        __Pyx_INCREF(__pyx_t_5);\n        __Pyx_INCREF(function);\n        __Pyx_DECREF_SET(__pyx_t_2, function);\n      }\n    }\n    __pyx_t_3 = (__pyx_t_5) ? __Pyx_PyObject_Call2Args(__pyx_t_2, __pyx_t_5, __pyx_t_4) : __Pyx_PyObject_CallOneArg(__pyx_t_2, __pyx_t_4);\n    __Pyx_XDECREF(__pyx_t_5); __pyx_t_5 = 0;\n    __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;\n    if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 6, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n    __Pyx_Raise(__pyx_t_3, 0, 0, 0);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n    __PYX_ERR(2, 6, __pyx_L1_error)\n\n    /* \"(tree fragment)\":4\n *     cdef object __pyx_PickleError\n *     cdef object __pyx_result\n *     if __pyx_checksum != 0xb068931:             # <<<<<<<<<<<<<<\n *         from pickle import PickleError as __pyx_PickleError\n *         raise __pyx_PickleError(\"Incompatible checksums (%s vs 0xb068931 = (name))\" % __pyx_checksum)\n */\n  }\n\n  /* \"(tree fragment)\":7\n *         from pickle import PickleError as __pyx_PickleError\n *         raise __pyx_PickleError(\"Incompatible checksums (%s vs 0xb068931 = (name))\" % __pyx_checksum)\n *     __pyx_result = Enum.__new__(__pyx_type)             # <<<<<<<<<<<<<<\n *     if __pyx_state is not None:\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)\n */\n  __pyx_t_2 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_MemviewEnum_type), __pyx_n_s_new); if (unlikely(!__pyx_t_2)) __PYX_ERR(2, 7, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_2);\n  __pyx_t_4 = NULL;\n  if (CYTHON_UNPACK_METHODS && likely(PyMethod_Check(__pyx_t_2))) {\n    __pyx_t_4 = PyMethod_GET_SELF(__pyx_t_2);\n    if (likely(__pyx_t_4)) {\n      PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_2);\n      __Pyx_INCREF(__pyx_t_4);\n      __Pyx_INCREF(function);\n      
__Pyx_DECREF_SET(__pyx_t_2, function);\n    }\n  }\n  __pyx_t_3 = (__pyx_t_4) ? __Pyx_PyObject_Call2Args(__pyx_t_2, __pyx_t_4, __pyx_v___pyx_type) : __Pyx_PyObject_CallOneArg(__pyx_t_2, __pyx_v___pyx_type);\n  __Pyx_XDECREF(__pyx_t_4); __pyx_t_4 = 0;\n  if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 7, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_3);\n  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;\n  __pyx_v___pyx_result = __pyx_t_3;\n  __pyx_t_3 = 0;\n\n  /* \"(tree fragment)\":8\n *         raise __pyx_PickleError(\"Incompatible checksums (%s vs 0xb068931 = (name))\" % __pyx_checksum)\n *     __pyx_result = Enum.__new__(__pyx_type)\n *     if __pyx_state is not None:             # <<<<<<<<<<<<<<\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)\n *     return __pyx_result\n */\n  __pyx_t_1 = (__pyx_v___pyx_state != Py_None);\n  __pyx_t_6 = (__pyx_t_1 != 0);\n  if (__pyx_t_6) {\n\n    /* \"(tree fragment)\":9\n *     __pyx_result = Enum.__new__(__pyx_type)\n *     if __pyx_state is not None:\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)             # <<<<<<<<<<<<<<\n *     return __pyx_result\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):\n */\n    if (!(likely(PyTuple_CheckExact(__pyx_v___pyx_state))||((__pyx_v___pyx_state) == Py_None)||(PyErr_Format(PyExc_TypeError, \"Expected %.16s, got %.200s\", \"tuple\", Py_TYPE(__pyx_v___pyx_state)->tp_name), 0))) __PYX_ERR(2, 9, __pyx_L1_error)\n    __pyx_t_3 = __pyx_unpickle_Enum__set_state(((struct __pyx_MemviewEnum_obj *)__pyx_v___pyx_result), ((PyObject*)__pyx_v___pyx_state)); if (unlikely(!__pyx_t_3)) __PYX_ERR(2, 9, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_3);\n    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;\n\n    /* \"(tree fragment)\":8\n *         raise __pyx_PickleError(\"Incompatible checksums (%s vs 0xb068931 = (name))\" % __pyx_checksum)\n *     __pyx_result = Enum.__new__(__pyx_type)\n *     if __pyx_state is not None:             # 
<<<<<<<<<<<<<<\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)\n *     return __pyx_result\n */\n  }\n\n  /* \"(tree fragment)\":10\n *     if __pyx_state is not None:\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)\n *     return __pyx_result             # <<<<<<<<<<<<<<\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):\n *     __pyx_result.name = __pyx_state[0]\n */\n  __Pyx_XDECREF(__pyx_r);\n  __Pyx_INCREF(__pyx_v___pyx_result);\n  __pyx_r = __pyx_v___pyx_result;\n  goto __pyx_L0;\n\n  /* \"(tree fragment)\":1\n * def __pyx_unpickle_Enum(__pyx_type, long __pyx_checksum, __pyx_state):             # <<<<<<<<<<<<<<\n *     cdef object __pyx_PickleError\n *     cdef object __pyx_result\n */\n\n  /* function exit code */\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_2);\n  __Pyx_XDECREF(__pyx_t_3);\n  __Pyx_XDECREF(__pyx_t_4);\n  __Pyx_XDECREF(__pyx_t_5);\n  __Pyx_AddTraceback(\"View.MemoryView.__pyx_unpickle_Enum\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = NULL;\n  __pyx_L0:;\n  __Pyx_XDECREF(__pyx_v___pyx_PickleError);\n  __Pyx_XDECREF(__pyx_v___pyx_result);\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\n\n/* \"(tree fragment)\":11\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)\n *     return __pyx_result\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):             # <<<<<<<<<<<<<<\n *     __pyx_result.name = __pyx_state[0]\n *     if len(__pyx_state) > 1 and hasattr(__pyx_result, '__dict__'):\n */\n\nstatic PyObject *__pyx_unpickle_Enum__set_state(struct __pyx_MemviewEnum_obj *__pyx_v___pyx_result, PyObject *__pyx_v___pyx_state) {\n  PyObject *__pyx_r = NULL;\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_t_2;\n  Py_ssize_t __pyx_t_3;\n  int __pyx_t_4;\n  int __pyx_t_5;\n  PyObject *__pyx_t_6 = NULL;\n  PyObject *__pyx_t_7 = NULL;\n  
PyObject *__pyx_t_8 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__pyx_unpickle_Enum__set_state\", 0);\n\n  /* \"(tree fragment)\":12\n *     return __pyx_result\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):\n *     __pyx_result.name = __pyx_state[0]             # <<<<<<<<<<<<<<\n *     if len(__pyx_state) > 1 and hasattr(__pyx_result, '__dict__'):\n *         __pyx_result.__dict__.update(__pyx_state[1])\n */\n  if (unlikely(__pyx_v___pyx_state == Py_None)) {\n    PyErr_SetString(PyExc_TypeError, \"'NoneType' object is not subscriptable\");\n    __PYX_ERR(2, 12, __pyx_L1_error)\n  }\n  __pyx_t_1 = __Pyx_GetItemInt_Tuple(__pyx_v___pyx_state, 0, long, 1, __Pyx_PyInt_From_long, 0, 0, 1); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 12, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_1);\n  __Pyx_GOTREF(__pyx_v___pyx_result->name);\n  __Pyx_DECREF(__pyx_v___pyx_result->name);\n  __pyx_v___pyx_result->name = __pyx_t_1;\n  __pyx_t_1 = 0;\n\n  /* \"(tree fragment)\":13\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):\n *     __pyx_result.name = __pyx_state[0]\n *     if len(__pyx_state) > 1 and hasattr(__pyx_result, '__dict__'):             # <<<<<<<<<<<<<<\n *         __pyx_result.__dict__.update(__pyx_state[1])\n */\n  if (unlikely(__pyx_v___pyx_state == Py_None)) {\n    PyErr_SetString(PyExc_TypeError, \"object of type 'NoneType' has no len()\");\n    __PYX_ERR(2, 13, __pyx_L1_error)\n  }\n  __pyx_t_3 = PyTuple_GET_SIZE(__pyx_v___pyx_state); if (unlikely(__pyx_t_3 == ((Py_ssize_t)-1))) __PYX_ERR(2, 13, __pyx_L1_error)\n  __pyx_t_4 = ((__pyx_t_3 > 1) != 0);\n  if (__pyx_t_4) {\n  } else {\n    __pyx_t_2 = __pyx_t_4;\n    goto __pyx_L4_bool_binop_done;\n  }\n  __pyx_t_4 = __Pyx_HasAttr(((PyObject *)__pyx_v___pyx_result), __pyx_n_s_dict); if (unlikely(__pyx_t_4 == ((int)-1))) __PYX_ERR(2, 13, __pyx_L1_error)\n  
__pyx_t_5 = (__pyx_t_4 != 0);\n  __pyx_t_2 = __pyx_t_5;\n  __pyx_L4_bool_binop_done:;\n  if (__pyx_t_2) {\n\n    /* \"(tree fragment)\":14\n *     __pyx_result.name = __pyx_state[0]\n *     if len(__pyx_state) > 1 and hasattr(__pyx_result, '__dict__'):\n *         __pyx_result.__dict__.update(__pyx_state[1])             # <<<<<<<<<<<<<<\n */\n    __pyx_t_6 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v___pyx_result), __pyx_n_s_dict); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 14, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_6);\n    __pyx_t_7 = __Pyx_PyObject_GetAttrStr(__pyx_t_6, __pyx_n_s_update); if (unlikely(!__pyx_t_7)) __PYX_ERR(2, 14, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_7);\n    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n    if (unlikely(__pyx_v___pyx_state == Py_None)) {\n      PyErr_SetString(PyExc_TypeError, \"'NoneType' object is not subscriptable\");\n      __PYX_ERR(2, 14, __pyx_L1_error)\n    }\n    __pyx_t_6 = __Pyx_GetItemInt_Tuple(__pyx_v___pyx_state, 1, long, 1, __Pyx_PyInt_From_long, 0, 0, 1); if (unlikely(!__pyx_t_6)) __PYX_ERR(2, 14, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_6);\n    __pyx_t_8 = NULL;\n    if (CYTHON_UNPACK_METHODS && likely(PyMethod_Check(__pyx_t_7))) {\n      __pyx_t_8 = PyMethod_GET_SELF(__pyx_t_7);\n      if (likely(__pyx_t_8)) {\n        PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_7);\n        __Pyx_INCREF(__pyx_t_8);\n        __Pyx_INCREF(function);\n        __Pyx_DECREF_SET(__pyx_t_7, function);\n      }\n    }\n    __pyx_t_1 = (__pyx_t_8) ? 
__Pyx_PyObject_Call2Args(__pyx_t_7, __pyx_t_8, __pyx_t_6) : __Pyx_PyObject_CallOneArg(__pyx_t_7, __pyx_t_6);\n    __Pyx_XDECREF(__pyx_t_8); __pyx_t_8 = 0;\n    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;\n    if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 14, __pyx_L1_error)\n    __Pyx_GOTREF(__pyx_t_1);\n    __Pyx_DECREF(__pyx_t_7); __pyx_t_7 = 0;\n    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n\n    /* \"(tree fragment)\":13\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):\n *     __pyx_result.name = __pyx_state[0]\n *     if len(__pyx_state) > 1 and hasattr(__pyx_result, '__dict__'):             # <<<<<<<<<<<<<<\n *         __pyx_result.__dict__.update(__pyx_state[1])\n */\n  }\n\n  /* \"(tree fragment)\":11\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)\n *     return __pyx_result\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):             # <<<<<<<<<<<<<<\n *     __pyx_result.name = __pyx_state[0]\n *     if len(__pyx_state) > 1 and hasattr(__pyx_result, '__dict__'):\n */\n\n  /* function exit code */\n  __pyx_r = Py_None; __Pyx_INCREF(Py_None);\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_XDECREF(__pyx_t_6);\n  __Pyx_XDECREF(__pyx_t_7);\n  __Pyx_XDECREF(__pyx_t_8);\n  __Pyx_AddTraceback(\"View.MemoryView.__pyx_unpickle_Enum__set_state\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n  __pyx_r = 0;\n  __pyx_L0:;\n  __Pyx_XGIVEREF(__pyx_r);\n  __Pyx_RefNannyFinishContext();\n  return __pyx_r;\n}\nstatic struct __pyx_vtabstruct_array __pyx_vtable_array;\n\nstatic PyObject *__pyx_tp_new_array(PyTypeObject *t, PyObject *a, PyObject *k) {\n  struct __pyx_array_obj *p;\n  PyObject *o;\n  if (likely((t->tp_flags & Py_TPFLAGS_IS_ABSTRACT) == 0)) {\n    o = (*t->tp_alloc)(t, 0);\n  } else {\n    o = (PyObject *) PyBaseObject_Type.tp_new(t, __pyx_empty_tuple, 0);\n  }\n  if (unlikely(!o)) return 0;\n  p = ((struct __pyx_array_obj *)o);\n  p->__pyx_vtab = 
__pyx_vtabptr_array;\n  p->mode = ((PyObject*)Py_None); Py_INCREF(Py_None);\n  p->_format = ((PyObject*)Py_None); Py_INCREF(Py_None);\n  if (unlikely(__pyx_array___cinit__(o, a, k) < 0)) goto bad;\n  return o;\n  bad:\n  Py_DECREF(o); o = 0;\n  return NULL;\n}\n\nstatic void __pyx_tp_dealloc_array(PyObject *o) {\n  struct __pyx_array_obj *p = (struct __pyx_array_obj *)o;\n  #if CYTHON_USE_TP_FINALIZE\n  if (unlikely(PyType_HasFeature(Py_TYPE(o), Py_TPFLAGS_HAVE_FINALIZE) && Py_TYPE(o)->tp_finalize) && (!PyType_IS_GC(Py_TYPE(o)) || !_PyGC_FINALIZED(o))) {\n    if (PyObject_CallFinalizerFromDealloc(o)) return;\n  }\n  #endif\n  {\n    PyObject *etype, *eval, *etb;\n    PyErr_Fetch(&etype, &eval, &etb);\n    __Pyx_SET_REFCNT(o, Py_REFCNT(o) + 1);\n    __pyx_array___dealloc__(o);\n    __Pyx_SET_REFCNT(o, Py_REFCNT(o) - 1);\n    PyErr_Restore(etype, eval, etb);\n  }\n  Py_CLEAR(p->mode);\n  Py_CLEAR(p->_format);\n  (*Py_TYPE(o)->tp_free)(o);\n}\nstatic PyObject *__pyx_sq_item_array(PyObject *o, Py_ssize_t i) {\n  PyObject *r;\n  PyObject *x = PyInt_FromSsize_t(i); if(!x) return 0;\n  r = Py_TYPE(o)->tp_as_mapping->mp_subscript(o, x);\n  Py_DECREF(x);\n  return r;\n}\n\nstatic int __pyx_mp_ass_subscript_array(PyObject *o, PyObject *i, PyObject *v) {\n  if (v) {\n    return __pyx_array___setitem__(o, i, v);\n  }\n  else {\n    PyErr_Format(PyExc_NotImplementedError,\n      \"Subscript deletion not supported by %.200s\", Py_TYPE(o)->tp_name);\n    return -1;\n  }\n}\n\nstatic PyObject *__pyx_tp_getattro_array(PyObject *o, PyObject *n) {\n  PyObject *v = __Pyx_PyObject_GenericGetAttr(o, n);\n  if (!v && PyErr_ExceptionMatches(PyExc_AttributeError)) {\n    PyErr_Clear();\n    v = __pyx_array___getattr__(o, n);\n  }\n  return v;\n}\n\nstatic PyObject *__pyx_getprop___pyx_array_memview(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_5array_7memview_1__get__(o);\n}\n\nstatic PyMethodDef __pyx_methods_array[] = {\n  {\"__getattr__\", 
(PyCFunction)__pyx_array___getattr__, METH_O|METH_COEXIST, 0},\n  {\"__reduce_cython__\", (PyCFunction)__pyx_pw___pyx_array_1__reduce_cython__, METH_NOARGS, 0},\n  {\"__setstate_cython__\", (PyCFunction)__pyx_pw___pyx_array_3__setstate_cython__, METH_O, 0},\n  {0, 0, 0, 0}\n};\n\nstatic struct PyGetSetDef __pyx_getsets_array[] = {\n  {(char *)\"memview\", __pyx_getprop___pyx_array_memview, 0, (char *)0, 0},\n  {0, 0, 0, 0, 0}\n};\n\nstatic PySequenceMethods __pyx_tp_as_sequence_array = {\n  __pyx_array___len__, /*sq_length*/\n  0, /*sq_concat*/\n  0, /*sq_repeat*/\n  __pyx_sq_item_array, /*sq_item*/\n  0, /*sq_slice*/\n  0, /*sq_ass_item*/\n  0, /*sq_ass_slice*/\n  0, /*sq_contains*/\n  0, /*sq_inplace_concat*/\n  0, /*sq_inplace_repeat*/\n};\n\nstatic PyMappingMethods __pyx_tp_as_mapping_array = {\n  __pyx_array___len__, /*mp_length*/\n  __pyx_array___getitem__, /*mp_subscript*/\n  __pyx_mp_ass_subscript_array, /*mp_ass_subscript*/\n};\n\nstatic PyBufferProcs __pyx_tp_as_buffer_array = {\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getreadbuffer*/\n  #endif\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getwritebuffer*/\n  #endif\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getsegcount*/\n  #endif\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getcharbuffer*/\n  #endif\n  __pyx_array_getbuffer, /*bf_getbuffer*/\n  0, /*bf_releasebuffer*/\n};\n\nstatic PyTypeObject __pyx_type___pyx_array = {\n  PyVarObject_HEAD_INIT(0, 0)\n  \"TTS.tts.utils.monotonic_align.core.array\", /*tp_name*/\n  sizeof(struct __pyx_array_obj), /*tp_basicsize*/\n  0, /*tp_itemsize*/\n  __pyx_tp_dealloc_array, /*tp_dealloc*/\n  #if PY_VERSION_HEX < 0x030800b4\n  0, /*tp_print*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4\n  0, /*tp_vectorcall_offset*/\n  #endif\n  0, /*tp_getattr*/\n  0, /*tp_setattr*/\n  #if PY_MAJOR_VERSION < 3\n  0, /*tp_compare*/\n  #endif\n  #if PY_MAJOR_VERSION >= 3\n  0, /*tp_as_async*/\n  #endif\n  0, /*tp_repr*/\n  0, /*tp_as_number*/\n  &__pyx_tp_as_sequence_array, /*tp_as_sequence*/\n  
&__pyx_tp_as_mapping_array, /*tp_as_mapping*/\n  0, /*tp_hash*/\n  0, /*tp_call*/\n  0, /*tp_str*/\n  __pyx_tp_getattro_array, /*tp_getattro*/\n  0, /*tp_setattro*/\n  &__pyx_tp_as_buffer_array, /*tp_as_buffer*/\n  Py_TPFLAGS_DEFAULT|Py_TPFLAGS_HAVE_VERSION_TAG|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER|Py_TPFLAGS_BASETYPE, /*tp_flags*/\n  0, /*tp_doc*/\n  0, /*tp_traverse*/\n  0, /*tp_clear*/\n  0, /*tp_richcompare*/\n  0, /*tp_weaklistoffset*/\n  0, /*tp_iter*/\n  0, /*tp_iternext*/\n  __pyx_methods_array, /*tp_methods*/\n  0, /*tp_members*/\n  __pyx_getsets_array, /*tp_getset*/\n  0, /*tp_base*/\n  0, /*tp_dict*/\n  0, /*tp_descr_get*/\n  0, /*tp_descr_set*/\n  0, /*tp_dictoffset*/\n  0, /*tp_init*/\n  0, /*tp_alloc*/\n  __pyx_tp_new_array, /*tp_new*/\n  0, /*tp_free*/\n  0, /*tp_is_gc*/\n  0, /*tp_bases*/\n  0, /*tp_mro*/\n  0, /*tp_cache*/\n  0, /*tp_subclasses*/\n  0, /*tp_weaklist*/\n  0, /*tp_del*/\n  0, /*tp_version_tag*/\n  #if PY_VERSION_HEX >= 0x030400a1\n  0, /*tp_finalize*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b1 && (!CYTHON_COMPILING_IN_PYPY || PYPY_VERSION_NUM >= 0x07030800)\n  0, /*tp_vectorcall*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4 && PY_VERSION_HEX < 0x03090000\n  0, /*tp_print*/\n  #endif\n  #if CYTHON_COMPILING_IN_PYPY && PY_VERSION_HEX >= 0x03090000\n  0, /*tp_pypy_flags*/\n  #endif\n};\n\nstatic PyObject *__pyx_tp_new_Enum(PyTypeObject *t, CYTHON_UNUSED PyObject *a, CYTHON_UNUSED PyObject *k) {\n  struct __pyx_MemviewEnum_obj *p;\n  PyObject *o;\n  if (likely((t->tp_flags & Py_TPFLAGS_IS_ABSTRACT) == 0)) {\n    o = (*t->tp_alloc)(t, 0);\n  } else {\n    o = (PyObject *) PyBaseObject_Type.tp_new(t, __pyx_empty_tuple, 0);\n  }\n  if (unlikely(!o)) return 0;\n  p = ((struct __pyx_MemviewEnum_obj *)o);\n  p->name = Py_None; Py_INCREF(Py_None);\n  return o;\n}\n\nstatic void __pyx_tp_dealloc_Enum(PyObject *o) {\n  struct __pyx_MemviewEnum_obj *p = (struct __pyx_MemviewEnum_obj *)o;\n  #if CYTHON_USE_TP_FINALIZE\n  if 
(unlikely(PyType_HasFeature(Py_TYPE(o), Py_TPFLAGS_HAVE_FINALIZE) && Py_TYPE(o)->tp_finalize) && !_PyGC_FINALIZED(o)) {\n    if (PyObject_CallFinalizerFromDealloc(o)) return;\n  }\n  #endif\n  PyObject_GC_UnTrack(o);\n  Py_CLEAR(p->name);\n  (*Py_TYPE(o)->tp_free)(o);\n}\n\nstatic int __pyx_tp_traverse_Enum(PyObject *o, visitproc v, void *a) {\n  int e;\n  struct __pyx_MemviewEnum_obj *p = (struct __pyx_MemviewEnum_obj *)o;\n  if (p->name) {\n    e = (*v)(p->name, a); if (e) return e;\n  }\n  return 0;\n}\n\nstatic int __pyx_tp_clear_Enum(PyObject *o) {\n  PyObject* tmp;\n  struct __pyx_MemviewEnum_obj *p = (struct __pyx_MemviewEnum_obj *)o;\n  tmp = ((PyObject*)p->name);\n  p->name = Py_None; Py_INCREF(Py_None);\n  Py_XDECREF(tmp);\n  return 0;\n}\n\nstatic PyMethodDef __pyx_methods_Enum[] = {\n  {\"__reduce_cython__\", (PyCFunction)__pyx_pw___pyx_MemviewEnum_1__reduce_cython__, METH_NOARGS, 0},\n  {\"__setstate_cython__\", (PyCFunction)__pyx_pw___pyx_MemviewEnum_3__setstate_cython__, METH_O, 0},\n  {0, 0, 0, 0}\n};\n\nstatic PyTypeObject __pyx_type___pyx_MemviewEnum = {\n  PyVarObject_HEAD_INIT(0, 0)\n  \"TTS.tts.utils.monotonic_align.core.Enum\", /*tp_name*/\n  sizeof(struct __pyx_MemviewEnum_obj), /*tp_basicsize*/\n  0, /*tp_itemsize*/\n  __pyx_tp_dealloc_Enum, /*tp_dealloc*/\n  #if PY_VERSION_HEX < 0x030800b4\n  0, /*tp_print*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4\n  0, /*tp_vectorcall_offset*/\n  #endif\n  0, /*tp_getattr*/\n  0, /*tp_setattr*/\n  #if PY_MAJOR_VERSION < 3\n  0, /*tp_compare*/\n  #endif\n  #if PY_MAJOR_VERSION >= 3\n  0, /*tp_as_async*/\n  #endif\n  __pyx_MemviewEnum___repr__, /*tp_repr*/\n  0, /*tp_as_number*/\n  0, /*tp_as_sequence*/\n  0, /*tp_as_mapping*/\n  0, /*tp_hash*/\n  0, /*tp_call*/\n  0, /*tp_str*/\n  0, /*tp_getattro*/\n  0, /*tp_setattro*/\n  0, /*tp_as_buffer*/\n  Py_TPFLAGS_DEFAULT|Py_TPFLAGS_HAVE_VERSION_TAG|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER|Py_TPFLAGS_BASETYPE|Py_TPFLAGS_HAVE_GC, /*tp_flags*/\n  0, 
/*tp_doc*/\n  __pyx_tp_traverse_Enum, /*tp_traverse*/\n  __pyx_tp_clear_Enum, /*tp_clear*/\n  0, /*tp_richcompare*/\n  0, /*tp_weaklistoffset*/\n  0, /*tp_iter*/\n  0, /*tp_iternext*/\n  __pyx_methods_Enum, /*tp_methods*/\n  0, /*tp_members*/\n  0, /*tp_getset*/\n  0, /*tp_base*/\n  0, /*tp_dict*/\n  0, /*tp_descr_get*/\n  0, /*tp_descr_set*/\n  0, /*tp_dictoffset*/\n  __pyx_MemviewEnum___init__, /*tp_init*/\n  0, /*tp_alloc*/\n  __pyx_tp_new_Enum, /*tp_new*/\n  0, /*tp_free*/\n  0, /*tp_is_gc*/\n  0, /*tp_bases*/\n  0, /*tp_mro*/\n  0, /*tp_cache*/\n  0, /*tp_subclasses*/\n  0, /*tp_weaklist*/\n  0, /*tp_del*/\n  0, /*tp_version_tag*/\n  #if PY_VERSION_HEX >= 0x030400a1\n  0, /*tp_finalize*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b1 && (!CYTHON_COMPILING_IN_PYPY || PYPY_VERSION_NUM >= 0x07030800)\n  0, /*tp_vectorcall*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4 && PY_VERSION_HEX < 0x03090000\n  0, /*tp_print*/\n  #endif\n  #if CYTHON_COMPILING_IN_PYPY && PY_VERSION_HEX >= 0x03090000\n  0, /*tp_pypy_flags*/\n  #endif\n};\nstatic struct __pyx_vtabstruct_memoryview __pyx_vtable_memoryview;\n\nstatic PyObject *__pyx_tp_new_memoryview(PyTypeObject *t, PyObject *a, PyObject *k) {\n  struct __pyx_memoryview_obj *p;\n  PyObject *o;\n  if (likely((t->tp_flags & Py_TPFLAGS_IS_ABSTRACT) == 0)) {\n    o = (*t->tp_alloc)(t, 0);\n  } else {\n    o = (PyObject *) PyBaseObject_Type.tp_new(t, __pyx_empty_tuple, 0);\n  }\n  if (unlikely(!o)) return 0;\n  p = ((struct __pyx_memoryview_obj *)o);\n  p->__pyx_vtab = __pyx_vtabptr_memoryview;\n  p->obj = Py_None; Py_INCREF(Py_None);\n  p->_size = Py_None; Py_INCREF(Py_None);\n  p->_array_interface = Py_None; Py_INCREF(Py_None);\n  p->view.obj = NULL;\n  if (unlikely(__pyx_memoryview___cinit__(o, a, k) < 0)) goto bad;\n  return o;\n  bad:\n  Py_DECREF(o); o = 0;\n  return NULL;\n}\n\nstatic void __pyx_tp_dealloc_memoryview(PyObject *o) {\n  struct __pyx_memoryview_obj *p = (struct __pyx_memoryview_obj *)o;\n  #if 
CYTHON_USE_TP_FINALIZE\n  if (unlikely(PyType_HasFeature(Py_TYPE(o), Py_TPFLAGS_HAVE_FINALIZE) && Py_TYPE(o)->tp_finalize) && !_PyGC_FINALIZED(o)) {\n    if (PyObject_CallFinalizerFromDealloc(o)) return;\n  }\n  #endif\n  PyObject_GC_UnTrack(o);\n  {\n    PyObject *etype, *eval, *etb;\n    PyErr_Fetch(&etype, &eval, &etb);\n    __Pyx_SET_REFCNT(o, Py_REFCNT(o) + 1);\n    __pyx_memoryview___dealloc__(o);\n    __Pyx_SET_REFCNT(o, Py_REFCNT(o) - 1);\n    PyErr_Restore(etype, eval, etb);\n  }\n  Py_CLEAR(p->obj);\n  Py_CLEAR(p->_size);\n  Py_CLEAR(p->_array_interface);\n  (*Py_TYPE(o)->tp_free)(o);\n}\n\nstatic int __pyx_tp_traverse_memoryview(PyObject *o, visitproc v, void *a) {\n  int e;\n  struct __pyx_memoryview_obj *p = (struct __pyx_memoryview_obj *)o;\n  if (p->obj) {\n    e = (*v)(p->obj, a); if (e) return e;\n  }\n  if (p->_size) {\n    e = (*v)(p->_size, a); if (e) return e;\n  }\n  if (p->_array_interface) {\n    e = (*v)(p->_array_interface, a); if (e) return e;\n  }\n  if (p->view.obj) {\n    e = (*v)(p->view.obj, a); if (e) return e;\n  }\n  return 0;\n}\n\nstatic int __pyx_tp_clear_memoryview(PyObject *o) {\n  PyObject* tmp;\n  struct __pyx_memoryview_obj *p = (struct __pyx_memoryview_obj *)o;\n  tmp = ((PyObject*)p->obj);\n  p->obj = Py_None; Py_INCREF(Py_None);\n  Py_XDECREF(tmp);\n  tmp = ((PyObject*)p->_size);\n  p->_size = Py_None; Py_INCREF(Py_None);\n  Py_XDECREF(tmp);\n  tmp = ((PyObject*)p->_array_interface);\n  p->_array_interface = Py_None; Py_INCREF(Py_None);\n  Py_XDECREF(tmp);\n  Py_CLEAR(p->view.obj);\n  return 0;\n}\nstatic PyObject *__pyx_sq_item_memoryview(PyObject *o, Py_ssize_t i) {\n  PyObject *r;\n  PyObject *x = PyInt_FromSsize_t(i); if(!x) return 0;\n  r = Py_TYPE(o)->tp_as_mapping->mp_subscript(o, x);\n  Py_DECREF(x);\n  return r;\n}\n\nstatic int __pyx_mp_ass_subscript_memoryview(PyObject *o, PyObject *i, PyObject *v) {\n  if (v) {\n    return __pyx_memoryview___setitem__(o, i, v);\n  }\n  else {\n    
PyErr_Format(PyExc_NotImplementedError,\n      \"Subscript deletion not supported by %.200s\", Py_TYPE(o)->tp_name);\n    return -1;\n  }\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_T(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_1T_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_base(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_4base_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_shape(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_5shape_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_strides(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_7strides_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_suboffsets(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_10suboffsets_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_ndim(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_4ndim_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_itemsize(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_8itemsize_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_nbytes(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_6nbytes_1__get__(o);\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryview_size(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_10memoryview_4size_1__get__(o);\n}\n\nstatic PyMethodDef __pyx_methods_memoryview[] = {\n  {\"is_c_contig\", (PyCFunction)__pyx_memoryview_is_c_contig, METH_NOARGS, 0},\n  {\"is_f_contig\", (PyCFunction)__pyx_memoryview_is_f_contig, METH_NOARGS, 0},\n  {\"copy\", (PyCFunction)__pyx_memoryview_copy, METH_NOARGS, 0},\n  
{\"copy_fortran\", (PyCFunction)__pyx_memoryview_copy_fortran, METH_NOARGS, 0},\n  {\"__reduce_cython__\", (PyCFunction)__pyx_pw___pyx_memoryview_1__reduce_cython__, METH_NOARGS, 0},\n  {\"__setstate_cython__\", (PyCFunction)__pyx_pw___pyx_memoryview_3__setstate_cython__, METH_O, 0},\n  {0, 0, 0, 0}\n};\n\nstatic struct PyGetSetDef __pyx_getsets_memoryview[] = {\n  {(char *)\"T\", __pyx_getprop___pyx_memoryview_T, 0, (char *)0, 0},\n  {(char *)\"base\", __pyx_getprop___pyx_memoryview_base, 0, (char *)0, 0},\n  {(char *)\"shape\", __pyx_getprop___pyx_memoryview_shape, 0, (char *)0, 0},\n  {(char *)\"strides\", __pyx_getprop___pyx_memoryview_strides, 0, (char *)0, 0},\n  {(char *)\"suboffsets\", __pyx_getprop___pyx_memoryview_suboffsets, 0, (char *)0, 0},\n  {(char *)\"ndim\", __pyx_getprop___pyx_memoryview_ndim, 0, (char *)0, 0},\n  {(char *)\"itemsize\", __pyx_getprop___pyx_memoryview_itemsize, 0, (char *)0, 0},\n  {(char *)\"nbytes\", __pyx_getprop___pyx_memoryview_nbytes, 0, (char *)0, 0},\n  {(char *)\"size\", __pyx_getprop___pyx_memoryview_size, 0, (char *)0, 0},\n  {0, 0, 0, 0, 0}\n};\n\nstatic PySequenceMethods __pyx_tp_as_sequence_memoryview = {\n  __pyx_memoryview___len__, /*sq_length*/\n  0, /*sq_concat*/\n  0, /*sq_repeat*/\n  __pyx_sq_item_memoryview, /*sq_item*/\n  0, /*sq_slice*/\n  0, /*sq_ass_item*/\n  0, /*sq_ass_slice*/\n  0, /*sq_contains*/\n  0, /*sq_inplace_concat*/\n  0, /*sq_inplace_repeat*/\n};\n\nstatic PyMappingMethods __pyx_tp_as_mapping_memoryview = {\n  __pyx_memoryview___len__, /*mp_length*/\n  __pyx_memoryview___getitem__, /*mp_subscript*/\n  __pyx_mp_ass_subscript_memoryview, /*mp_ass_subscript*/\n};\n\nstatic PyBufferProcs __pyx_tp_as_buffer_memoryview = {\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getreadbuffer*/\n  #endif\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getwritebuffer*/\n  #endif\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getsegcount*/\n  #endif\n  #if PY_MAJOR_VERSION < 3\n  0, /*bf_getcharbuffer*/\n  #endif\n  
__pyx_memoryview_getbuffer, /*bf_getbuffer*/\n  0, /*bf_releasebuffer*/\n};\n\nstatic PyTypeObject __pyx_type___pyx_memoryview = {\n  PyVarObject_HEAD_INIT(0, 0)\n  \"TTS.tts.utils.monotonic_align.core.memoryview\", /*tp_name*/\n  sizeof(struct __pyx_memoryview_obj), /*tp_basicsize*/\n  0, /*tp_itemsize*/\n  __pyx_tp_dealloc_memoryview, /*tp_dealloc*/\n  #if PY_VERSION_HEX < 0x030800b4\n  0, /*tp_print*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4\n  0, /*tp_vectorcall_offset*/\n  #endif\n  0, /*tp_getattr*/\n  0, /*tp_setattr*/\n  #if PY_MAJOR_VERSION < 3\n  0, /*tp_compare*/\n  #endif\n  #if PY_MAJOR_VERSION >= 3\n  0, /*tp_as_async*/\n  #endif\n  __pyx_memoryview___repr__, /*tp_repr*/\n  0, /*tp_as_number*/\n  &__pyx_tp_as_sequence_memoryview, /*tp_as_sequence*/\n  &__pyx_tp_as_mapping_memoryview, /*tp_as_mapping*/\n  0, /*tp_hash*/\n  0, /*tp_call*/\n  __pyx_memoryview___str__, /*tp_str*/\n  0, /*tp_getattro*/\n  0, /*tp_setattro*/\n  &__pyx_tp_as_buffer_memoryview, /*tp_as_buffer*/\n  Py_TPFLAGS_DEFAULT|Py_TPFLAGS_HAVE_VERSION_TAG|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER|Py_TPFLAGS_BASETYPE|Py_TPFLAGS_HAVE_GC, /*tp_flags*/\n  0, /*tp_doc*/\n  __pyx_tp_traverse_memoryview, /*tp_traverse*/\n  __pyx_tp_clear_memoryview, /*tp_clear*/\n  0, /*tp_richcompare*/\n  0, /*tp_weaklistoffset*/\n  0, /*tp_iter*/\n  0, /*tp_iternext*/\n  __pyx_methods_memoryview, /*tp_methods*/\n  0, /*tp_members*/\n  __pyx_getsets_memoryview, /*tp_getset*/\n  0, /*tp_base*/\n  0, /*tp_dict*/\n  0, /*tp_descr_get*/\n  0, /*tp_descr_set*/\n  0, /*tp_dictoffset*/\n  0, /*tp_init*/\n  0, /*tp_alloc*/\n  __pyx_tp_new_memoryview, /*tp_new*/\n  0, /*tp_free*/\n  0, /*tp_is_gc*/\n  0, /*tp_bases*/\n  0, /*tp_mro*/\n  0, /*tp_cache*/\n  0, /*tp_subclasses*/\n  0, /*tp_weaklist*/\n  0, /*tp_del*/\n  0, /*tp_version_tag*/\n  #if PY_VERSION_HEX >= 0x030400a1\n  0, /*tp_finalize*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b1 && (!CYTHON_COMPILING_IN_PYPY || PYPY_VERSION_NUM >= 0x07030800)\n  
0, /*tp_vectorcall*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4 && PY_VERSION_HEX < 0x03090000\n  0, /*tp_print*/\n  #endif\n  #if CYTHON_COMPILING_IN_PYPY && PY_VERSION_HEX >= 0x03090000\n  0, /*tp_pypy_flags*/\n  #endif\n};\nstatic struct __pyx_vtabstruct__memoryviewslice __pyx_vtable__memoryviewslice;\n\nstatic PyObject *__pyx_tp_new__memoryviewslice(PyTypeObject *t, PyObject *a, PyObject *k) {\n  struct __pyx_memoryviewslice_obj *p;\n  PyObject *o = __pyx_tp_new_memoryview(t, a, k);\n  if (unlikely(!o)) return 0;\n  p = ((struct __pyx_memoryviewslice_obj *)o);\n  p->__pyx_base.__pyx_vtab = (struct __pyx_vtabstruct_memoryview*)__pyx_vtabptr__memoryviewslice;\n  p->from_object = Py_None; Py_INCREF(Py_None);\n  p->from_slice.memview = NULL;\n  return o;\n}\n\nstatic void __pyx_tp_dealloc__memoryviewslice(PyObject *o) {\n  struct __pyx_memoryviewslice_obj *p = (struct __pyx_memoryviewslice_obj *)o;\n  #if CYTHON_USE_TP_FINALIZE\n  if (unlikely(PyType_HasFeature(Py_TYPE(o), Py_TPFLAGS_HAVE_FINALIZE) && Py_TYPE(o)->tp_finalize) && !_PyGC_FINALIZED(o)) {\n    if (PyObject_CallFinalizerFromDealloc(o)) return;\n  }\n  #endif\n  PyObject_GC_UnTrack(o);\n  {\n    PyObject *etype, *eval, *etb;\n    PyErr_Fetch(&etype, &eval, &etb);\n    __Pyx_SET_REFCNT(o, Py_REFCNT(o) + 1);\n    __pyx_memoryviewslice___dealloc__(o);\n    __Pyx_SET_REFCNT(o, Py_REFCNT(o) - 1);\n    PyErr_Restore(etype, eval, etb);\n  }\n  Py_CLEAR(p->from_object);\n  PyObject_GC_Track(o);\n  __pyx_tp_dealloc_memoryview(o);\n}\n\nstatic int __pyx_tp_traverse__memoryviewslice(PyObject *o, visitproc v, void *a) {\n  int e;\n  struct __pyx_memoryviewslice_obj *p = (struct __pyx_memoryviewslice_obj *)o;\n  e = __pyx_tp_traverse_memoryview(o, v, a); if (e) return e;\n  if (p->from_object) {\n    e = (*v)(p->from_object, a); if (e) return e;\n  }\n  return 0;\n}\n\nstatic int __pyx_tp_clear__memoryviewslice(PyObject *o) {\n  PyObject* tmp;\n  struct __pyx_memoryviewslice_obj *p = (struct 
__pyx_memoryviewslice_obj *)o;\n  __pyx_tp_clear_memoryview(o);\n  tmp = ((PyObject*)p->from_object);\n  p->from_object = Py_None; Py_INCREF(Py_None);\n  Py_XDECREF(tmp);\n  __PYX_XDEC_MEMVIEW(&p->from_slice, 1);\n  return 0;\n}\n\nstatic PyObject *__pyx_getprop___pyx_memoryviewslice_base(PyObject *o, CYTHON_UNUSED void *x) {\n  return __pyx_pw_15View_dot_MemoryView_16_memoryviewslice_4base_1__get__(o);\n}\n\nstatic PyMethodDef __pyx_methods__memoryviewslice[] = {\n  {\"__reduce_cython__\", (PyCFunction)__pyx_pw___pyx_memoryviewslice_1__reduce_cython__, METH_NOARGS, 0},\n  {\"__setstate_cython__\", (PyCFunction)__pyx_pw___pyx_memoryviewslice_3__setstate_cython__, METH_O, 0},\n  {0, 0, 0, 0}\n};\n\nstatic struct PyGetSetDef __pyx_getsets__memoryviewslice[] = {\n  {(char *)\"base\", __pyx_getprop___pyx_memoryviewslice_base, 0, (char *)0, 0},\n  {0, 0, 0, 0, 0}\n};\n\nstatic PyTypeObject __pyx_type___pyx_memoryviewslice = {\n  PyVarObject_HEAD_INIT(0, 0)\n  \"TTS.tts.utils.monotonic_align.core._memoryviewslice\", /*tp_name*/\n  sizeof(struct __pyx_memoryviewslice_obj), /*tp_basicsize*/\n  0, /*tp_itemsize*/\n  __pyx_tp_dealloc__memoryviewslice, /*tp_dealloc*/\n  #if PY_VERSION_HEX < 0x030800b4\n  0, /*tp_print*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4\n  0, /*tp_vectorcall_offset*/\n  #endif\n  0, /*tp_getattr*/\n  0, /*tp_setattr*/\n  #if PY_MAJOR_VERSION < 3\n  0, /*tp_compare*/\n  #endif\n  #if PY_MAJOR_VERSION >= 3\n  0, /*tp_as_async*/\n  #endif\n  #if CYTHON_COMPILING_IN_PYPY\n  __pyx_memoryview___repr__, /*tp_repr*/\n  #else\n  0, /*tp_repr*/\n  #endif\n  0, /*tp_as_number*/\n  0, /*tp_as_sequence*/\n  0, /*tp_as_mapping*/\n  0, /*tp_hash*/\n  0, /*tp_call*/\n  #if CYTHON_COMPILING_IN_PYPY\n  __pyx_memoryview___str__, /*tp_str*/\n  #else\n  0, /*tp_str*/\n  #endif\n  0, /*tp_getattro*/\n  0, /*tp_setattro*/\n  0, /*tp_as_buffer*/\n  
Py_TPFLAGS_DEFAULT|Py_TPFLAGS_HAVE_VERSION_TAG|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER|Py_TPFLAGS_BASETYPE|Py_TPFLAGS_HAVE_GC, /*tp_flags*/\n  \"Internal class for passing memoryview slices to Python\", /*tp_doc*/\n  __pyx_tp_traverse__memoryviewslice, /*tp_traverse*/\n  __pyx_tp_clear__memoryviewslice, /*tp_clear*/\n  0, /*tp_richcompare*/\n  0, /*tp_weaklistoffset*/\n  0, /*tp_iter*/\n  0, /*tp_iternext*/\n  __pyx_methods__memoryviewslice, /*tp_methods*/\n  0, /*tp_members*/\n  __pyx_getsets__memoryviewslice, /*tp_getset*/\n  0, /*tp_base*/\n  0, /*tp_dict*/\n  0, /*tp_descr_get*/\n  0, /*tp_descr_set*/\n  0, /*tp_dictoffset*/\n  0, /*tp_init*/\n  0, /*tp_alloc*/\n  __pyx_tp_new__memoryviewslice, /*tp_new*/\n  0, /*tp_free*/\n  0, /*tp_is_gc*/\n  0, /*tp_bases*/\n  0, /*tp_mro*/\n  0, /*tp_cache*/\n  0, /*tp_subclasses*/\n  0, /*tp_weaklist*/\n  0, /*tp_del*/\n  0, /*tp_version_tag*/\n  #if PY_VERSION_HEX >= 0x030400a1\n  0, /*tp_finalize*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b1 && (!CYTHON_COMPILING_IN_PYPY || PYPY_VERSION_NUM >= 0x07030800)\n  0, /*tp_vectorcall*/\n  #endif\n  #if PY_VERSION_HEX >= 0x030800b4 && PY_VERSION_HEX < 0x03090000\n  0, /*tp_print*/\n  #endif\n  #if CYTHON_COMPILING_IN_PYPY && PY_VERSION_HEX >= 0x03090000\n  0, /*tp_pypy_flags*/\n  #endif\n};\n\nstatic PyMethodDef __pyx_methods[] = {\n  {\"maximum_path_c\", (PyCFunction)(void*)(PyCFunctionWithKeywords)__pyx_pw_3TTS_3tts_5utils_15monotonic_align_4core_1maximum_path_c, METH_VARARGS|METH_KEYWORDS, 0},\n  {0, 0, 0, 0}\n};\n\n#if PY_MAJOR_VERSION >= 3\n#if CYTHON_PEP489_MULTI_PHASE_INIT\nstatic PyObject* __pyx_pymod_create(PyObject *spec, PyModuleDef *def); /*proto*/\nstatic int __pyx_pymod_exec_core(PyObject* module); /*proto*/\nstatic PyModuleDef_Slot __pyx_moduledef_slots[] = {\n  {Py_mod_create, (void*)__pyx_pymod_create},\n  {Py_mod_exec, (void*)__pyx_pymod_exec_core},\n  {0, NULL}\n};\n#endif\n\nstatic struct PyModuleDef __pyx_moduledef = {\n    PyModuleDef_HEAD_INIT,\n 
   \"core\",\n    0, /* m_doc */\n  #if CYTHON_PEP489_MULTI_PHASE_INIT\n    0, /* m_size */\n  #else\n    -1, /* m_size */\n  #endif\n    __pyx_methods /* m_methods */,\n  #if CYTHON_PEP489_MULTI_PHASE_INIT\n    __pyx_moduledef_slots, /* m_slots */\n  #else\n    NULL, /* m_reload */\n  #endif\n    NULL, /* m_traverse */\n    NULL, /* m_clear */\n    NULL /* m_free */\n};\n#endif\n#ifndef CYTHON_SMALL_CODE\n#if defined(__clang__)\n    #define CYTHON_SMALL_CODE\n#elif defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))\n    #define CYTHON_SMALL_CODE __attribute__((cold))\n#else\n    #define CYTHON_SMALL_CODE\n#endif\n#endif\n\nstatic __Pyx_StringTabEntry __pyx_string_tab[] = {\n  {&__pyx_n_s_ASCII, __pyx_k_ASCII, sizeof(__pyx_k_ASCII), 0, 0, 1, 1},\n  {&__pyx_kp_s_Buffer_view_does_not_expose_stri, __pyx_k_Buffer_view_does_not_expose_stri, sizeof(__pyx_k_Buffer_view_does_not_expose_stri), 0, 0, 1, 0},\n  {&__pyx_kp_s_Can_only_create_a_buffer_that_is, __pyx_k_Can_only_create_a_buffer_that_is, sizeof(__pyx_k_Can_only_create_a_buffer_that_is), 0, 0, 1, 0},\n  {&__pyx_kp_s_Cannot_assign_to_read_only_memor, __pyx_k_Cannot_assign_to_read_only_memor, sizeof(__pyx_k_Cannot_assign_to_read_only_memor), 0, 0, 1, 0},\n  {&__pyx_kp_s_Cannot_create_writable_memory_vi, __pyx_k_Cannot_create_writable_memory_vi, sizeof(__pyx_k_Cannot_create_writable_memory_vi), 0, 0, 1, 0},\n  {&__pyx_kp_s_Cannot_index_with_type_s, __pyx_k_Cannot_index_with_type_s, sizeof(__pyx_k_Cannot_index_with_type_s), 0, 0, 1, 0},\n  {&__pyx_n_s_Ellipsis, __pyx_k_Ellipsis, sizeof(__pyx_k_Ellipsis), 0, 0, 1, 1},\n  {&__pyx_kp_s_Empty_shape_tuple_for_cython_arr, __pyx_k_Empty_shape_tuple_for_cython_arr, sizeof(__pyx_k_Empty_shape_tuple_for_cython_arr), 0, 0, 1, 0},\n  {&__pyx_n_s_ImportError, __pyx_k_ImportError, sizeof(__pyx_k_ImportError), 0, 0, 1, 1},\n  {&__pyx_kp_s_Incompatible_checksums_s_vs_0xb0, __pyx_k_Incompatible_checksums_s_vs_0xb0, 
sizeof(__pyx_k_Incompatible_checksums_s_vs_0xb0), 0, 0, 1, 0},\n  {&__pyx_n_s_IndexError, __pyx_k_IndexError, sizeof(__pyx_k_IndexError), 0, 0, 1, 1},\n  {&__pyx_kp_s_Indirect_dimensions_not_supporte, __pyx_k_Indirect_dimensions_not_supporte, sizeof(__pyx_k_Indirect_dimensions_not_supporte), 0, 0, 1, 0},\n  {&__pyx_kp_s_Invalid_mode_expected_c_or_fortr, __pyx_k_Invalid_mode_expected_c_or_fortr, sizeof(__pyx_k_Invalid_mode_expected_c_or_fortr), 0, 0, 1, 0},\n  {&__pyx_kp_s_Invalid_shape_in_axis_d_d, __pyx_k_Invalid_shape_in_axis_d_d, sizeof(__pyx_k_Invalid_shape_in_axis_d_d), 0, 0, 1, 0},\n  {&__pyx_n_s_MemoryError, __pyx_k_MemoryError, sizeof(__pyx_k_MemoryError), 0, 0, 1, 1},\n  {&__pyx_kp_s_MemoryView_of_r_at_0x_x, __pyx_k_MemoryView_of_r_at_0x_x, sizeof(__pyx_k_MemoryView_of_r_at_0x_x), 0, 0, 1, 0},\n  {&__pyx_kp_s_MemoryView_of_r_object, __pyx_k_MemoryView_of_r_object, sizeof(__pyx_k_MemoryView_of_r_object), 0, 0, 1, 0},\n  {&__pyx_n_b_O, __pyx_k_O, sizeof(__pyx_k_O), 0, 0, 0, 1},\n  {&__pyx_kp_s_Out_of_bounds_on_buffer_access_a, __pyx_k_Out_of_bounds_on_buffer_access_a, sizeof(__pyx_k_Out_of_bounds_on_buffer_access_a), 0, 0, 1, 0},\n  {&__pyx_n_s_PickleError, __pyx_k_PickleError, sizeof(__pyx_k_PickleError), 0, 0, 1, 1},\n  {&__pyx_n_s_TypeError, __pyx_k_TypeError, sizeof(__pyx_k_TypeError), 0, 0, 1, 1},\n  {&__pyx_kp_s_Unable_to_convert_item_to_object, __pyx_k_Unable_to_convert_item_to_object, sizeof(__pyx_k_Unable_to_convert_item_to_object), 0, 0, 1, 0},\n  {&__pyx_n_s_ValueError, __pyx_k_ValueError, sizeof(__pyx_k_ValueError), 0, 0, 1, 1},\n  {&__pyx_n_s_View_MemoryView, __pyx_k_View_MemoryView, sizeof(__pyx_k_View_MemoryView), 0, 0, 1, 1},\n  {&__pyx_n_s_allocate_buffer, __pyx_k_allocate_buffer, sizeof(__pyx_k_allocate_buffer), 0, 0, 1, 1},\n  {&__pyx_n_s_base, __pyx_k_base, sizeof(__pyx_k_base), 0, 0, 1, 1},\n  {&__pyx_n_s_c, __pyx_k_c, sizeof(__pyx_k_c), 0, 0, 1, 1},\n  {&__pyx_n_u_c, __pyx_k_c, sizeof(__pyx_k_c), 0, 1, 0, 1},\n  {&__pyx_n_s_class, 
__pyx_k_class, sizeof(__pyx_k_class), 0, 0, 1, 1},\n  {&__pyx_n_s_cline_in_traceback, __pyx_k_cline_in_traceback, sizeof(__pyx_k_cline_in_traceback), 0, 0, 1, 1},\n  {&__pyx_kp_s_contiguous_and_direct, __pyx_k_contiguous_and_direct, sizeof(__pyx_k_contiguous_and_direct), 0, 0, 1, 0},\n  {&__pyx_kp_s_contiguous_and_indirect, __pyx_k_contiguous_and_indirect, sizeof(__pyx_k_contiguous_and_indirect), 0, 0, 1, 0},\n  {&__pyx_n_s_dict, __pyx_k_dict, sizeof(__pyx_k_dict), 0, 0, 1, 1},\n  {&__pyx_n_s_dtype_is_object, __pyx_k_dtype_is_object, sizeof(__pyx_k_dtype_is_object), 0, 0, 1, 1},\n  {&__pyx_n_s_encode, __pyx_k_encode, sizeof(__pyx_k_encode), 0, 0, 1, 1},\n  {&__pyx_n_s_enumerate, __pyx_k_enumerate, sizeof(__pyx_k_enumerate), 0, 0, 1, 1},\n  {&__pyx_n_s_error, __pyx_k_error, sizeof(__pyx_k_error), 0, 0, 1, 1},\n  {&__pyx_n_s_flags, __pyx_k_flags, sizeof(__pyx_k_flags), 0, 0, 1, 1},\n  {&__pyx_n_s_format, __pyx_k_format, sizeof(__pyx_k_format), 0, 0, 1, 1},\n  {&__pyx_n_s_fortran, __pyx_k_fortran, sizeof(__pyx_k_fortran), 0, 0, 1, 1},\n  {&__pyx_n_u_fortran, __pyx_k_fortran, sizeof(__pyx_k_fortran), 0, 1, 0, 1},\n  {&__pyx_n_s_getstate, __pyx_k_getstate, sizeof(__pyx_k_getstate), 0, 0, 1, 1},\n  {&__pyx_kp_s_got_differing_extents_in_dimensi, __pyx_k_got_differing_extents_in_dimensi, sizeof(__pyx_k_got_differing_extents_in_dimensi), 0, 0, 1, 0},\n  {&__pyx_n_s_id, __pyx_k_id, sizeof(__pyx_k_id), 0, 0, 1, 1},\n  {&__pyx_n_s_import, __pyx_k_import, sizeof(__pyx_k_import), 0, 0, 1, 1},\n  {&__pyx_n_s_itemsize, __pyx_k_itemsize, sizeof(__pyx_k_itemsize), 0, 0, 1, 1},\n  {&__pyx_kp_s_itemsize_0_for_cython_array, __pyx_k_itemsize_0_for_cython_array, sizeof(__pyx_k_itemsize_0_for_cython_array), 0, 0, 1, 0},\n  {&__pyx_n_s_main, __pyx_k_main, sizeof(__pyx_k_main), 0, 0, 1, 1},\n  {&__pyx_n_s_max_neg_val, __pyx_k_max_neg_val, sizeof(__pyx_k_max_neg_val), 0, 0, 1, 1},\n  {&__pyx_n_s_memview, __pyx_k_memview, sizeof(__pyx_k_memview), 0, 0, 1, 1},\n  {&__pyx_n_s_mode, 
__pyx_k_mode, sizeof(__pyx_k_mode), 0, 0, 1, 1},\n  {&__pyx_n_s_name, __pyx_k_name, sizeof(__pyx_k_name), 0, 0, 1, 1},\n  {&__pyx_n_s_name_2, __pyx_k_name_2, sizeof(__pyx_k_name_2), 0, 0, 1, 1},\n  {&__pyx_n_s_ndim, __pyx_k_ndim, sizeof(__pyx_k_ndim), 0, 0, 1, 1},\n  {&__pyx_n_s_new, __pyx_k_new, sizeof(__pyx_k_new), 0, 0, 1, 1},\n  {&__pyx_kp_s_no_default___reduce___due_to_non, __pyx_k_no_default___reduce___due_to_non, sizeof(__pyx_k_no_default___reduce___due_to_non), 0, 0, 1, 0},\n  {&__pyx_n_s_np, __pyx_k_np, sizeof(__pyx_k_np), 0, 0, 1, 1},\n  {&__pyx_n_s_numpy, __pyx_k_numpy, sizeof(__pyx_k_numpy), 0, 0, 1, 1},\n  {&__pyx_kp_u_numpy_core_multiarray_failed_to, __pyx_k_numpy_core_multiarray_failed_to, sizeof(__pyx_k_numpy_core_multiarray_failed_to), 0, 1, 0, 0},\n  {&__pyx_kp_u_numpy_core_umath_failed_to_impor, __pyx_k_numpy_core_umath_failed_to_impor, sizeof(__pyx_k_numpy_core_umath_failed_to_impor), 0, 1, 0, 0},\n  {&__pyx_n_s_obj, __pyx_k_obj, sizeof(__pyx_k_obj), 0, 0, 1, 1},\n  {&__pyx_n_s_pack, __pyx_k_pack, sizeof(__pyx_k_pack), 0, 0, 1, 1},\n  {&__pyx_n_s_paths, __pyx_k_paths, sizeof(__pyx_k_paths), 0, 0, 1, 1},\n  {&__pyx_n_s_pickle, __pyx_k_pickle, sizeof(__pyx_k_pickle), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_PickleError, __pyx_k_pyx_PickleError, sizeof(__pyx_k_pyx_PickleError), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_checksum, __pyx_k_pyx_checksum, sizeof(__pyx_k_pyx_checksum), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_getbuffer, __pyx_k_pyx_getbuffer, sizeof(__pyx_k_pyx_getbuffer), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_result, __pyx_k_pyx_result, sizeof(__pyx_k_pyx_result), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_state, __pyx_k_pyx_state, sizeof(__pyx_k_pyx_state), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_type, __pyx_k_pyx_type, sizeof(__pyx_k_pyx_type), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_unpickle_Enum, __pyx_k_pyx_unpickle_Enum, sizeof(__pyx_k_pyx_unpickle_Enum), 0, 0, 1, 1},\n  {&__pyx_n_s_pyx_vtable, __pyx_k_pyx_vtable, sizeof(__pyx_k_pyx_vtable), 0, 0, 1, 1},\n  {&__pyx_n_s_range, __pyx_k_range, 
sizeof(__pyx_k_range), 0, 0, 1, 1},\n  {&__pyx_n_s_reduce, __pyx_k_reduce, sizeof(__pyx_k_reduce), 0, 0, 1, 1},\n  {&__pyx_n_s_reduce_cython, __pyx_k_reduce_cython, sizeof(__pyx_k_reduce_cython), 0, 0, 1, 1},\n  {&__pyx_n_s_reduce_ex, __pyx_k_reduce_ex, sizeof(__pyx_k_reduce_ex), 0, 0, 1, 1},\n  {&__pyx_n_s_setstate, __pyx_k_setstate, sizeof(__pyx_k_setstate), 0, 0, 1, 1},\n  {&__pyx_n_s_setstate_cython, __pyx_k_setstate_cython, sizeof(__pyx_k_setstate_cython), 0, 0, 1, 1},\n  {&__pyx_n_s_shape, __pyx_k_shape, sizeof(__pyx_k_shape), 0, 0, 1, 1},\n  {&__pyx_n_s_size, __pyx_k_size, sizeof(__pyx_k_size), 0, 0, 1, 1},\n  {&__pyx_n_s_start, __pyx_k_start, sizeof(__pyx_k_start), 0, 0, 1, 1},\n  {&__pyx_n_s_step, __pyx_k_step, sizeof(__pyx_k_step), 0, 0, 1, 1},\n  {&__pyx_n_s_stop, __pyx_k_stop, sizeof(__pyx_k_stop), 0, 0, 1, 1},\n  {&__pyx_kp_s_strided_and_direct, __pyx_k_strided_and_direct, sizeof(__pyx_k_strided_and_direct), 0, 0, 1, 0},\n  {&__pyx_kp_s_strided_and_direct_or_indirect, __pyx_k_strided_and_direct_or_indirect, sizeof(__pyx_k_strided_and_direct_or_indirect), 0, 0, 1, 0},\n  {&__pyx_kp_s_strided_and_indirect, __pyx_k_strided_and_indirect, sizeof(__pyx_k_strided_and_indirect), 0, 0, 1, 0},\n  {&__pyx_kp_s_stringsource, __pyx_k_stringsource, sizeof(__pyx_k_stringsource), 0, 0, 1, 0},\n  {&__pyx_n_s_struct, __pyx_k_struct, sizeof(__pyx_k_struct), 0, 0, 1, 1},\n  {&__pyx_n_s_t_xs, __pyx_k_t_xs, sizeof(__pyx_k_t_xs), 0, 0, 1, 1},\n  {&__pyx_n_s_t_ys, __pyx_k_t_ys, sizeof(__pyx_k_t_ys), 0, 0, 1, 1},\n  {&__pyx_n_s_test, __pyx_k_test, sizeof(__pyx_k_test), 0, 0, 1, 1},\n  {&__pyx_kp_s_unable_to_allocate_array_data, __pyx_k_unable_to_allocate_array_data, sizeof(__pyx_k_unable_to_allocate_array_data), 0, 0, 1, 0},\n  {&__pyx_kp_s_unable_to_allocate_shape_and_str, __pyx_k_unable_to_allocate_shape_and_str, sizeof(__pyx_k_unable_to_allocate_shape_and_str), 0, 0, 1, 0},\n  {&__pyx_n_s_unpack, __pyx_k_unpack, sizeof(__pyx_k_unpack), 0, 0, 1, 1},\n  {&__pyx_n_s_update, 
__pyx_k_update, sizeof(__pyx_k_update), 0, 0, 1, 1},\n  {&__pyx_n_s_values, __pyx_k_values, sizeof(__pyx_k_values), 0, 0, 1, 1},\n  {0, 0, 0, 0, 0, 0, 0}\n};\nstatic CYTHON_SMALL_CODE int __Pyx_InitCachedBuiltins(void) {\n  __pyx_builtin_range = __Pyx_GetBuiltinName(__pyx_n_s_range); if (!__pyx_builtin_range) __PYX_ERR(0, 19, __pyx_L1_error)\n  __pyx_builtin_ImportError = __Pyx_GetBuiltinName(__pyx_n_s_ImportError); if (!__pyx_builtin_ImportError) __PYX_ERR(1, 945, __pyx_L1_error)\n  __pyx_builtin_ValueError = __Pyx_GetBuiltinName(__pyx_n_s_ValueError); if (!__pyx_builtin_ValueError) __PYX_ERR(2, 133, __pyx_L1_error)\n  __pyx_builtin_MemoryError = __Pyx_GetBuiltinName(__pyx_n_s_MemoryError); if (!__pyx_builtin_MemoryError) __PYX_ERR(2, 148, __pyx_L1_error)\n  __pyx_builtin_enumerate = __Pyx_GetBuiltinName(__pyx_n_s_enumerate); if (!__pyx_builtin_enumerate) __PYX_ERR(2, 151, __pyx_L1_error)\n  __pyx_builtin_TypeError = __Pyx_GetBuiltinName(__pyx_n_s_TypeError); if (!__pyx_builtin_TypeError) __PYX_ERR(2, 2, __pyx_L1_error)\n  __pyx_builtin_Ellipsis = __Pyx_GetBuiltinName(__pyx_n_s_Ellipsis); if (!__pyx_builtin_Ellipsis) __PYX_ERR(2, 404, __pyx_L1_error)\n  __pyx_builtin_id = __Pyx_GetBuiltinName(__pyx_n_s_id); if (!__pyx_builtin_id) __PYX_ERR(2, 613, __pyx_L1_error)\n  __pyx_builtin_IndexError = __Pyx_GetBuiltinName(__pyx_n_s_IndexError); if (!__pyx_builtin_IndexError) __PYX_ERR(2, 832, __pyx_L1_error)\n  return 0;\n  __pyx_L1_error:;\n  return -1;\n}\n\nstatic CYTHON_SMALL_CODE int __Pyx_InitCachedConstants(void) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__Pyx_InitCachedConstants\", 0);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":945\n *         __pyx_import_array()\n *     except Exception:\n *         raise ImportError(\"numpy.core.multiarray failed to import\")             # <<<<<<<<<<<<<<\n * \n * cdef inline int import_umath() except -1:\n */\n  __pyx_tuple__2 = 
PyTuple_Pack(1, __pyx_kp_u_numpy_core_multiarray_failed_to); if (unlikely(!__pyx_tuple__2)) __PYX_ERR(1, 945, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__2);\n  __Pyx_GIVEREF(__pyx_tuple__2);\n\n  /* \"C:/Users/GIANMA~1/AppData/Local/Temp/pip-build-env-eetufkp7/overlay/Lib/site-packages/numpy/__init__.pxd\":951\n *         _import_umath()\n *     except Exception:\n *         raise ImportError(\"numpy.core.umath failed to import\")             # <<<<<<<<<<<<<<\n * \n * cdef inline int import_ufunc() except -1:\n */\n  __pyx_tuple__3 = PyTuple_Pack(1, __pyx_kp_u_numpy_core_umath_failed_to_impor); if (unlikely(!__pyx_tuple__3)) __PYX_ERR(1, 951, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__3);\n  __Pyx_GIVEREF(__pyx_tuple__3);\n\n  /* \"View.MemoryView\":133\n * \n *         if not self.ndim:\n *             raise ValueError(\"Empty shape tuple for cython.array\")             # <<<<<<<<<<<<<<\n * \n *         if itemsize <= 0:\n */\n  __pyx_tuple__4 = PyTuple_Pack(1, __pyx_kp_s_Empty_shape_tuple_for_cython_arr); if (unlikely(!__pyx_tuple__4)) __PYX_ERR(2, 133, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__4);\n  __Pyx_GIVEREF(__pyx_tuple__4);\n\n  /* \"View.MemoryView\":136\n * \n *         if itemsize <= 0:\n *             raise ValueError(\"itemsize <= 0 for cython.array\")             # <<<<<<<<<<<<<<\n * \n *         if not isinstance(format, bytes):\n */\n  __pyx_tuple__5 = PyTuple_Pack(1, __pyx_kp_s_itemsize_0_for_cython_array); if (unlikely(!__pyx_tuple__5)) __PYX_ERR(2, 136, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__5);\n  __Pyx_GIVEREF(__pyx_tuple__5);\n\n  /* \"View.MemoryView\":148\n * \n *         if not self._shape:\n *             raise MemoryError(\"unable to allocate shape and strides.\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_tuple__6 = PyTuple_Pack(1, __pyx_kp_s_unable_to_allocate_shape_and_str); if (unlikely(!__pyx_tuple__6)) __PYX_ERR(2, 148, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__6);\n  __Pyx_GIVEREF(__pyx_tuple__6);\n\n  
/* \"View.MemoryView\":176\n *             self.data = <char *>malloc(self.len)\n *             if not self.data:\n *                 raise MemoryError(\"unable to allocate array data.\")             # <<<<<<<<<<<<<<\n * \n *             if self.dtype_is_object:\n */\n  __pyx_tuple__7 = PyTuple_Pack(1, __pyx_kp_s_unable_to_allocate_array_data); if (unlikely(!__pyx_tuple__7)) __PYX_ERR(2, 176, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__7);\n  __Pyx_GIVEREF(__pyx_tuple__7);\n\n  /* \"View.MemoryView\":192\n *             bufmode = PyBUF_F_CONTIGUOUS | PyBUF_ANY_CONTIGUOUS\n *         if not (flags & bufmode):\n *             raise ValueError(\"Can only create a buffer that is contiguous in memory.\")             # <<<<<<<<<<<<<<\n *         info.buf = self.data\n *         info.len = self.len\n */\n  __pyx_tuple__8 = PyTuple_Pack(1, __pyx_kp_s_Can_only_create_a_buffer_that_is); if (unlikely(!__pyx_tuple__8)) __PYX_ERR(2, 192, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__8);\n  __Pyx_GIVEREF(__pyx_tuple__8);\n\n  /* \"(tree fragment)\":2\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n  __pyx_tuple__9 = PyTuple_Pack(1, __pyx_kp_s_no_default___reduce___due_to_non); if (unlikely(!__pyx_tuple__9)) __PYX_ERR(2, 2, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__9);\n  __Pyx_GIVEREF(__pyx_tuple__9);\n\n  /* \"(tree fragment)\":4\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n */\n  __pyx_tuple__10 = PyTuple_Pack(1, __pyx_kp_s_no_default___reduce___due_to_non); if (unlikely(!__pyx_tuple__10)) __PYX_ERR(2, 4, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__10);\n  
__Pyx_GIVEREF(__pyx_tuple__10);\n\n  /* \"View.MemoryView\":418\n *     def __setitem__(memoryview self, object index, object value):\n *         if self.view.readonly:\n *             raise TypeError(\"Cannot assign to read-only memoryview\")             # <<<<<<<<<<<<<<\n * \n *         have_slices, index = _unellipsify(index, self.view.ndim)\n */\n  __pyx_tuple__11 = PyTuple_Pack(1, __pyx_kp_s_Cannot_assign_to_read_only_memor); if (unlikely(!__pyx_tuple__11)) __PYX_ERR(2, 418, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__11);\n  __Pyx_GIVEREF(__pyx_tuple__11);\n\n  /* \"View.MemoryView\":495\n *             result = struct.unpack(self.view.format, bytesitem)\n *         except struct.error:\n *             raise ValueError(\"Unable to convert item to object\")             # <<<<<<<<<<<<<<\n *         else:\n *             if len(self.view.format) == 1:\n */\n  __pyx_tuple__12 = PyTuple_Pack(1, __pyx_kp_s_Unable_to_convert_item_to_object); if (unlikely(!__pyx_tuple__12)) __PYX_ERR(2, 495, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__12);\n  __Pyx_GIVEREF(__pyx_tuple__12);\n\n  /* \"View.MemoryView\":520\n *     def __getbuffer__(self, Py_buffer *info, int flags):\n *         if flags & PyBUF_WRITABLE and self.view.readonly:\n *             raise ValueError(\"Cannot create writable memory view from read-only memoryview\")             # <<<<<<<<<<<<<<\n * \n *         if flags & PyBUF_ND:\n */\n  __pyx_tuple__13 = PyTuple_Pack(1, __pyx_kp_s_Cannot_create_writable_memory_vi); if (unlikely(!__pyx_tuple__13)) __PYX_ERR(2, 520, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__13);\n  __Pyx_GIVEREF(__pyx_tuple__13);\n\n  /* \"View.MemoryView\":570\n *         if self.view.strides == NULL:\n * \n *             raise ValueError(\"Buffer view does not expose strides\")             # <<<<<<<<<<<<<<\n * \n *         return tuple([stride for stride in self.view.strides[:self.view.ndim]])\n */\n  __pyx_tuple__14 = PyTuple_Pack(1, __pyx_kp_s_Buffer_view_does_not_expose_stri); if 
(unlikely(!__pyx_tuple__14)) __PYX_ERR(2, 570, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__14);\n  __Pyx_GIVEREF(__pyx_tuple__14);\n\n  /* \"View.MemoryView\":577\n *     def suboffsets(self):\n *         if self.view.suboffsets == NULL:\n *             return (-1,) * self.view.ndim             # <<<<<<<<<<<<<<\n * \n *         return tuple([suboffset for suboffset in self.view.suboffsets[:self.view.ndim]])\n */\n  __pyx_tuple__15 = PyTuple_New(1); if (unlikely(!__pyx_tuple__15)) __PYX_ERR(2, 577, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__15);\n  __Pyx_INCREF(__pyx_int_neg_1);\n  __Pyx_GIVEREF(__pyx_int_neg_1);\n  PyTuple_SET_ITEM(__pyx_tuple__15, 0, __pyx_int_neg_1);\n  __Pyx_GIVEREF(__pyx_tuple__15);\n\n  /* \"(tree fragment)\":2\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n  __pyx_tuple__16 = PyTuple_Pack(1, __pyx_kp_s_no_default___reduce___due_to_non); if (unlikely(!__pyx_tuple__16)) __PYX_ERR(2, 2, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__16);\n  __Pyx_GIVEREF(__pyx_tuple__16);\n\n  /* \"(tree fragment)\":4\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n */\n  __pyx_tuple__17 = PyTuple_Pack(1, __pyx_kp_s_no_default___reduce___due_to_non); if (unlikely(!__pyx_tuple__17)) __PYX_ERR(2, 4, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__17);\n  __Pyx_GIVEREF(__pyx_tuple__17);\n\n  /* \"View.MemoryView\":682\n *         if item is Ellipsis:\n *             if not seen_ellipsis:\n *                 result.extend([slice(None)] * (ndim - len(tup) + 1))             # <<<<<<<<<<<<<<\n *                 seen_ellipsis = True\n *             else:\n */\n 
 __pyx_slice__18 = PySlice_New(Py_None, Py_None, Py_None); if (unlikely(!__pyx_slice__18)) __PYX_ERR(2, 682, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_slice__18);\n  __Pyx_GIVEREF(__pyx_slice__18);\n\n  /* \"View.MemoryView\":703\n *     for suboffset in suboffsets[:ndim]:\n *         if suboffset >= 0:\n *             raise ValueError(\"Indirect dimensions not supported\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_tuple__19 = PyTuple_Pack(1, __pyx_kp_s_Indirect_dimensions_not_supporte); if (unlikely(!__pyx_tuple__19)) __PYX_ERR(2, 703, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__19);\n  __Pyx_GIVEREF(__pyx_tuple__19);\n\n  /* \"(tree fragment)\":2\n * def __reduce_cython__(self):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n */\n  __pyx_tuple__20 = PyTuple_Pack(1, __pyx_kp_s_no_default___reduce___due_to_non); if (unlikely(!__pyx_tuple__20)) __PYX_ERR(2, 2, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__20);\n  __Pyx_GIVEREF(__pyx_tuple__20);\n\n  /* \"(tree fragment)\":4\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")\n * def __setstate_cython__(self, __pyx_state):\n *     raise TypeError(\"no default __reduce__ due to non-trivial __cinit__\")             # <<<<<<<<<<<<<<\n */\n  __pyx_tuple__21 = PyTuple_Pack(1, __pyx_kp_s_no_default___reduce___due_to_non); if (unlikely(!__pyx_tuple__21)) __PYX_ERR(2, 4, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__21);\n  __Pyx_GIVEREF(__pyx_tuple__21);\n\n  /* \"View.MemoryView\":286\n *         return self.name\n * \n * cdef generic = Enum(\"<strided and direct or indirect>\")             # <<<<<<<<<<<<<<\n * cdef strided = Enum(\"<strided and direct>\") # default\n * cdef indirect = Enum(\"<strided and indirect>\")\n */\n  __pyx_tuple__22 = PyTuple_Pack(1, __pyx_kp_s_strided_and_direct_or_indirect); 
if (unlikely(!__pyx_tuple__22)) __PYX_ERR(2, 286, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__22);\n  __Pyx_GIVEREF(__pyx_tuple__22);\n\n  /* \"View.MemoryView\":287\n * \n * cdef generic = Enum(\"<strided and direct or indirect>\")\n * cdef strided = Enum(\"<strided and direct>\") # default             # <<<<<<<<<<<<<<\n * cdef indirect = Enum(\"<strided and indirect>\")\n * \n */\n  __pyx_tuple__23 = PyTuple_Pack(1, __pyx_kp_s_strided_and_direct); if (unlikely(!__pyx_tuple__23)) __PYX_ERR(2, 287, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__23);\n  __Pyx_GIVEREF(__pyx_tuple__23);\n\n  /* \"View.MemoryView\":288\n * cdef generic = Enum(\"<strided and direct or indirect>\")\n * cdef strided = Enum(\"<strided and direct>\") # default\n * cdef indirect = Enum(\"<strided and indirect>\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_tuple__24 = PyTuple_Pack(1, __pyx_kp_s_strided_and_indirect); if (unlikely(!__pyx_tuple__24)) __PYX_ERR(2, 288, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__24);\n  __Pyx_GIVEREF(__pyx_tuple__24);\n\n  /* \"View.MemoryView\":291\n * \n * \n * cdef contiguous = Enum(\"<contiguous and direct>\")             # <<<<<<<<<<<<<<\n * cdef indirect_contiguous = Enum(\"<contiguous and indirect>\")\n * \n */\n  __pyx_tuple__25 = PyTuple_Pack(1, __pyx_kp_s_contiguous_and_direct); if (unlikely(!__pyx_tuple__25)) __PYX_ERR(2, 291, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__25);\n  __Pyx_GIVEREF(__pyx_tuple__25);\n\n  /* \"View.MemoryView\":292\n * \n * cdef contiguous = Enum(\"<contiguous and direct>\")\n * cdef indirect_contiguous = Enum(\"<contiguous and indirect>\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_tuple__26 = PyTuple_Pack(1, __pyx_kp_s_contiguous_and_indirect); if (unlikely(!__pyx_tuple__26)) __PYX_ERR(2, 292, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__26);\n  __Pyx_GIVEREF(__pyx_tuple__26);\n\n  /* \"(tree fragment)\":1\n * def __pyx_unpickle_Enum(__pyx_type, long __pyx_checksum, __pyx_state):             # 
<<<<<<<<<<<<<<\n *     cdef object __pyx_PickleError\n *     cdef object __pyx_result\n */\n  __pyx_tuple__27 = PyTuple_Pack(5, __pyx_n_s_pyx_type, __pyx_n_s_pyx_checksum, __pyx_n_s_pyx_state, __pyx_n_s_pyx_PickleError, __pyx_n_s_pyx_result); if (unlikely(!__pyx_tuple__27)) __PYX_ERR(2, 1, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_tuple__27);\n  __Pyx_GIVEREF(__pyx_tuple__27);\n  __pyx_codeobj__28 = (PyObject*)__Pyx_PyCode_New(3, 0, 5, 0, CO_OPTIMIZED|CO_NEWLOCALS, __pyx_empty_bytes, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_tuple__27, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_kp_s_stringsource, __pyx_n_s_pyx_unpickle_Enum, 1, __pyx_empty_bytes); if (unlikely(!__pyx_codeobj__28)) __PYX_ERR(2, 1, __pyx_L1_error)\n  __Pyx_RefNannyFinishContext();\n  return 0;\n  __pyx_L1_error:;\n  __Pyx_RefNannyFinishContext();\n  return -1;\n}\n\nstatic CYTHON_SMALL_CODE int __Pyx_InitGlobals(void) {\n  /* InitThreads.init */\n  #if defined(WITH_THREAD) && PY_VERSION_HEX < 0x030700F0\nPyEval_InitThreads();\n#endif\n\nif (unlikely(PyErr_Occurred())) __PYX_ERR(0, 1, __pyx_L1_error)\n\n  if (__Pyx_InitStrings(__pyx_string_tab) < 0) __PYX_ERR(0, 1, __pyx_L1_error);\n  __pyx_int_0 = PyInt_FromLong(0); if (unlikely(!__pyx_int_0)) __PYX_ERR(0, 1, __pyx_L1_error)\n  __pyx_int_1 = PyInt_FromLong(1); if (unlikely(!__pyx_int_1)) __PYX_ERR(0, 1, __pyx_L1_error)\n  __pyx_int_184977713 = PyInt_FromLong(184977713L); if (unlikely(!__pyx_int_184977713)) __PYX_ERR(0, 1, __pyx_L1_error)\n  __pyx_int_neg_1 = PyInt_FromLong(-1); if (unlikely(!__pyx_int_neg_1)) __PYX_ERR(0, 1, __pyx_L1_error)\n  return 0;\n  __pyx_L1_error:;\n  return -1;\n}\n\nstatic CYTHON_SMALL_CODE int __Pyx_modinit_global_init_code(void); /*proto*/\nstatic CYTHON_SMALL_CODE int __Pyx_modinit_variable_export_code(void); /*proto*/\nstatic CYTHON_SMALL_CODE int __Pyx_modinit_function_export_code(void); /*proto*/\nstatic CYTHON_SMALL_CODE int __Pyx_modinit_type_init_code(void); /*proto*/\nstatic CYTHON_SMALL_CODE int 
__Pyx_modinit_type_import_code(void); /*proto*/\nstatic CYTHON_SMALL_CODE int __Pyx_modinit_variable_import_code(void); /*proto*/\nstatic CYTHON_SMALL_CODE int __Pyx_modinit_function_import_code(void); /*proto*/\n\nstatic int __Pyx_modinit_global_init_code(void) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__Pyx_modinit_global_init_code\", 0);\n  /*--- Global init code ---*/\n  generic = Py_None; Py_INCREF(Py_None);\n  strided = Py_None; Py_INCREF(Py_None);\n  indirect = Py_None; Py_INCREF(Py_None);\n  contiguous = Py_None; Py_INCREF(Py_None);\n  indirect_contiguous = Py_None; Py_INCREF(Py_None);\n  __Pyx_RefNannyFinishContext();\n  return 0;\n}\n\nstatic int __Pyx_modinit_variable_export_code(void) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__Pyx_modinit_variable_export_code\", 0);\n  /*--- Variable export code ---*/\n  __Pyx_RefNannyFinishContext();\n  return 0;\n}\n\nstatic int __Pyx_modinit_function_export_code(void) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__Pyx_modinit_function_export_code\", 0);\n  /*--- Function export code ---*/\n  __Pyx_RefNannyFinishContext();\n  return 0;\n}\n\nstatic int __Pyx_modinit_type_init_code(void) {\n  __Pyx_RefNannyDeclarations\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__Pyx_modinit_type_init_code\", 0);\n  /*--- Type init code ---*/\n  __pyx_vtabptr_array = &__pyx_vtable_array;\n  __pyx_vtable_array.get_memview = (PyObject *(*)(struct __pyx_array_obj *))__pyx_array_get_memview;\n  if (PyType_Ready(&__pyx_type___pyx_array) < 0) __PYX_ERR(2, 105, __pyx_L1_error)\n  #if PY_VERSION_HEX < 0x030800B1\n  __pyx_type___pyx_array.tp_print = 0;\n  #endif\n  if (__Pyx_SetVtable(__pyx_type___pyx_array.tp_dict, __pyx_vtabptr_array) < 0) __PYX_ERR(2, 105, __pyx_L1_error)\n  if (__Pyx_setup_reduce((PyObject*)&__pyx_type___pyx_array) < 0) __PYX_ERR(2, 105, __pyx_L1_error)\n  __pyx_array_type = 
&__pyx_type___pyx_array;\n  if (PyType_Ready(&__pyx_type___pyx_MemviewEnum) < 0) __PYX_ERR(2, 279, __pyx_L1_error)\n  #if PY_VERSION_HEX < 0x030800B1\n  __pyx_type___pyx_MemviewEnum.tp_print = 0;\n  #endif\n  if ((CYTHON_USE_TYPE_SLOTS && CYTHON_USE_PYTYPE_LOOKUP) && likely(!__pyx_type___pyx_MemviewEnum.tp_dictoffset && __pyx_type___pyx_MemviewEnum.tp_getattro == PyObject_GenericGetAttr)) {\n    __pyx_type___pyx_MemviewEnum.tp_getattro = __Pyx_PyObject_GenericGetAttr;\n  }\n  if (__Pyx_setup_reduce((PyObject*)&__pyx_type___pyx_MemviewEnum) < 0) __PYX_ERR(2, 279, __pyx_L1_error)\n  __pyx_MemviewEnum_type = &__pyx_type___pyx_MemviewEnum;\n  __pyx_vtabptr_memoryview = &__pyx_vtable_memoryview;\n  __pyx_vtable_memoryview.get_item_pointer = (char *(*)(struct __pyx_memoryview_obj *, PyObject *))__pyx_memoryview_get_item_pointer;\n  __pyx_vtable_memoryview.is_slice = (PyObject *(*)(struct __pyx_memoryview_obj *, PyObject *))__pyx_memoryview_is_slice;\n  __pyx_vtable_memoryview.setitem_slice_assignment = (PyObject *(*)(struct __pyx_memoryview_obj *, PyObject *, PyObject *))__pyx_memoryview_setitem_slice_assignment;\n  __pyx_vtable_memoryview.setitem_slice_assign_scalar = (PyObject *(*)(struct __pyx_memoryview_obj *, struct __pyx_memoryview_obj *, PyObject *))__pyx_memoryview_setitem_slice_assign_scalar;\n  __pyx_vtable_memoryview.setitem_indexed = (PyObject *(*)(struct __pyx_memoryview_obj *, PyObject *, PyObject *))__pyx_memoryview_setitem_indexed;\n  __pyx_vtable_memoryview.convert_item_to_object = (PyObject *(*)(struct __pyx_memoryview_obj *, char *))__pyx_memoryview_convert_item_to_object;\n  __pyx_vtable_memoryview.assign_item_from_object = (PyObject *(*)(struct __pyx_memoryview_obj *, char *, PyObject *))__pyx_memoryview_assign_item_from_object;\n  if (PyType_Ready(&__pyx_type___pyx_memoryview) < 0) __PYX_ERR(2, 330, __pyx_L1_error)\n  #if PY_VERSION_HEX < 0x030800B1\n  __pyx_type___pyx_memoryview.tp_print = 0;\n  #endif\n  if ((CYTHON_USE_TYPE_SLOTS && 
CYTHON_USE_PYTYPE_LOOKUP) && likely(!__pyx_type___pyx_memoryview.tp_dictoffset && __pyx_type___pyx_memoryview.tp_getattro == PyObject_GenericGetAttr)) {\n    __pyx_type___pyx_memoryview.tp_getattro = __Pyx_PyObject_GenericGetAttr;\n  }\n  if (__Pyx_SetVtable(__pyx_type___pyx_memoryview.tp_dict, __pyx_vtabptr_memoryview) < 0) __PYX_ERR(2, 330, __pyx_L1_error)\n  if (__Pyx_setup_reduce((PyObject*)&__pyx_type___pyx_memoryview) < 0) __PYX_ERR(2, 330, __pyx_L1_error)\n  __pyx_memoryview_type = &__pyx_type___pyx_memoryview;\n  __pyx_vtabptr__memoryviewslice = &__pyx_vtable__memoryviewslice;\n  __pyx_vtable__memoryviewslice.__pyx_base = *__pyx_vtabptr_memoryview;\n  __pyx_vtable__memoryviewslice.__pyx_base.convert_item_to_object = (PyObject *(*)(struct __pyx_memoryview_obj *, char *))__pyx_memoryviewslice_convert_item_to_object;\n  __pyx_vtable__memoryviewslice.__pyx_base.assign_item_from_object = (PyObject *(*)(struct __pyx_memoryview_obj *, char *, PyObject *))__pyx_memoryviewslice_assign_item_from_object;\n  __pyx_type___pyx_memoryviewslice.tp_base = __pyx_memoryview_type;\n  if (PyType_Ready(&__pyx_type___pyx_memoryviewslice) < 0) __PYX_ERR(2, 965, __pyx_L1_error)\n  #if PY_VERSION_HEX < 0x030800B1\n  __pyx_type___pyx_memoryviewslice.tp_print = 0;\n  #endif\n  if ((CYTHON_USE_TYPE_SLOTS && CYTHON_USE_PYTYPE_LOOKUP) && likely(!__pyx_type___pyx_memoryviewslice.tp_dictoffset && __pyx_type___pyx_memoryviewslice.tp_getattro == PyObject_GenericGetAttr)) {\n    __pyx_type___pyx_memoryviewslice.tp_getattro = __Pyx_PyObject_GenericGetAttr;\n  }\n  if (__Pyx_SetVtable(__pyx_type___pyx_memoryviewslice.tp_dict, __pyx_vtabptr__memoryviewslice) < 0) __PYX_ERR(2, 965, __pyx_L1_error)\n  if (__Pyx_setup_reduce((PyObject*)&__pyx_type___pyx_memoryviewslice) < 0) __PYX_ERR(2, 965, __pyx_L1_error)\n  __pyx_memoryviewslice_type = &__pyx_type___pyx_memoryviewslice;\n  __Pyx_RefNannyFinishContext();\n  return 0;\n  __pyx_L1_error:;\n  __Pyx_RefNannyFinishContext();\n  return 
-1;\n}\n\nstatic int __Pyx_modinit_type_import_code(void) {\n  __Pyx_RefNannyDeclarations\n  PyObject *__pyx_t_1 = NULL;\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannySetupContext(\"__Pyx_modinit_type_import_code\", 0);\n  /*--- Type import code ---*/\n  __pyx_t_1 = PyImport_ImportModule(__Pyx_BUILTIN_MODULE_NAME); if (unlikely(!__pyx_t_1)) __PYX_ERR(3, 9, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_ptype_7cpython_4type_type = __Pyx_ImportType(__pyx_t_1, __Pyx_BUILTIN_MODULE_NAME, \"type\", \n  #if defined(PYPY_VERSION_NUM) && PYPY_VERSION_NUM < 0x050B0000\n  sizeof(PyTypeObject),\n  #else\n  sizeof(PyHeapTypeObject),\n  #endif\n  __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_7cpython_4type_type) __PYX_ERR(3, 9, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __pyx_t_1 = PyImport_ImportModule(\"numpy\"); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 200, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __pyx_ptype_5numpy_dtype = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"dtype\", sizeof(PyArray_Descr), __Pyx_ImportType_CheckSize_Ignore);\n   if (!__pyx_ptype_5numpy_dtype) __PYX_ERR(1, 200, __pyx_L1_error)\n  __pyx_ptype_5numpy_flatiter = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"flatiter\", sizeof(PyArrayIterObject), __Pyx_ImportType_CheckSize_Ignore);\n   if (!__pyx_ptype_5numpy_flatiter) __PYX_ERR(1, 223, __pyx_L1_error)\n  __pyx_ptype_5numpy_broadcast = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"broadcast\", sizeof(PyArrayMultiIterObject), __Pyx_ImportType_CheckSize_Ignore);\n   if (!__pyx_ptype_5numpy_broadcast) __PYX_ERR(1, 227, __pyx_L1_error)\n  __pyx_ptype_5numpy_ndarray = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"ndarray\", sizeof(PyArrayObject), __Pyx_ImportType_CheckSize_Ignore);\n   if (!__pyx_ptype_5numpy_ndarray) __PYX_ERR(1, 239, __pyx_L1_error)\n  __pyx_ptype_5numpy_generic = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"generic\", sizeof(PyObject), 
__Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_generic) __PYX_ERR(1, 771, __pyx_L1_error)\n  __pyx_ptype_5numpy_number = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"number\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_number) __PYX_ERR(1, 773, __pyx_L1_error)\n  __pyx_ptype_5numpy_integer = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"integer\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_integer) __PYX_ERR(1, 775, __pyx_L1_error)\n  __pyx_ptype_5numpy_signedinteger = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"signedinteger\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_signedinteger) __PYX_ERR(1, 777, __pyx_L1_error)\n  __pyx_ptype_5numpy_unsignedinteger = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"unsignedinteger\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_unsignedinteger) __PYX_ERR(1, 779, __pyx_L1_error)\n  __pyx_ptype_5numpy_inexact = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"inexact\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_inexact) __PYX_ERR(1, 781, __pyx_L1_error)\n  __pyx_ptype_5numpy_floating = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"floating\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_floating) __PYX_ERR(1, 783, __pyx_L1_error)\n  __pyx_ptype_5numpy_complexfloating = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"complexfloating\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_complexfloating) __PYX_ERR(1, 785, __pyx_L1_error)\n  __pyx_ptype_5numpy_flexible = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"flexible\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if (!__pyx_ptype_5numpy_flexible) __PYX_ERR(1, 787, __pyx_L1_error)\n  __pyx_ptype_5numpy_character = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"character\", sizeof(PyObject), __Pyx_ImportType_CheckSize_Warn);\n   if 
(!__pyx_ptype_5numpy_character) __PYX_ERR(1, 789, __pyx_L1_error)\n  __pyx_ptype_5numpy_ufunc = __Pyx_ImportType(__pyx_t_1, \"numpy\", \"ufunc\", sizeof(PyUFuncObject), __Pyx_ImportType_CheckSize_Ignore);\n   if (!__pyx_ptype_5numpy_ufunc) __PYX_ERR(1, 827, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  __Pyx_RefNannyFinishContext();\n  return 0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  __Pyx_RefNannyFinishContext();\n  return -1;\n}\n\nstatic int __Pyx_modinit_variable_import_code(void) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__Pyx_modinit_variable_import_code\", 0);\n  /*--- Variable import code ---*/\n  __Pyx_RefNannyFinishContext();\n  return 0;\n}\n\nstatic int __Pyx_modinit_function_import_code(void) {\n  __Pyx_RefNannyDeclarations\n  __Pyx_RefNannySetupContext(\"__Pyx_modinit_function_import_code\", 0);\n  /*--- Function import code ---*/\n  __Pyx_RefNannyFinishContext();\n  return 0;\n}\n\n\n#ifndef CYTHON_NO_PYINIT_EXPORT\n#define __Pyx_PyMODINIT_FUNC PyMODINIT_FUNC\n#elif PY_MAJOR_VERSION < 3\n#ifdef __cplusplus\n#define __Pyx_PyMODINIT_FUNC extern \"C\" void\n#else\n#define __Pyx_PyMODINIT_FUNC void\n#endif\n#else\n#ifdef __cplusplus\n#define __Pyx_PyMODINIT_FUNC extern \"C\" PyObject *\n#else\n#define __Pyx_PyMODINIT_FUNC PyObject *\n#endif\n#endif\n\n\n#if PY_MAJOR_VERSION < 3\n__Pyx_PyMODINIT_FUNC initcore(void) CYTHON_SMALL_CODE; /*proto*/\n__Pyx_PyMODINIT_FUNC initcore(void)\n#else\n__Pyx_PyMODINIT_FUNC PyInit_core(void) CYTHON_SMALL_CODE; /*proto*/\n__Pyx_PyMODINIT_FUNC PyInit_core(void)\n#if CYTHON_PEP489_MULTI_PHASE_INIT\n{\n  return PyModuleDef_Init(&__pyx_moduledef);\n}\nstatic CYTHON_SMALL_CODE int __Pyx_check_single_interpreter(void) {\n    #if PY_VERSION_HEX >= 0x030700A1\n    static PY_INT64_T main_interpreter_id = -1;\n    PY_INT64_T current_id = PyInterpreterState_GetID(PyThreadState_Get()->interp);\n    if (main_interpreter_id == -1) {\n        main_interpreter_id = current_id;\n        
return (unlikely(current_id == -1)) ? -1 : 0;\n    } else if (unlikely(main_interpreter_id != current_id))\n    #else\n    static PyInterpreterState *main_interpreter = NULL;\n    PyInterpreterState *current_interpreter = PyThreadState_Get()->interp;\n    if (!main_interpreter) {\n        main_interpreter = current_interpreter;\n    } else if (unlikely(main_interpreter != current_interpreter))\n    #endif\n    {\n        PyErr_SetString(\n            PyExc_ImportError,\n            \"Interpreter change detected - this module can only be loaded into one interpreter per process.\");\n        return -1;\n    }\n    return 0;\n}\nstatic CYTHON_SMALL_CODE int __Pyx_copy_spec_to_module(PyObject *spec, PyObject *moddict, const char* from_name, const char* to_name, int allow_none) {\n    PyObject *value = PyObject_GetAttrString(spec, from_name);\n    int result = 0;\n    if (likely(value)) {\n        if (allow_none || value != Py_None) {\n            result = PyDict_SetItemString(moddict, to_name, value);\n        }\n        Py_DECREF(value);\n    } else if (PyErr_ExceptionMatches(PyExc_AttributeError)) {\n        PyErr_Clear();\n    } else {\n        result = -1;\n    }\n    return result;\n}\nstatic CYTHON_SMALL_CODE PyObject* __pyx_pymod_create(PyObject *spec, CYTHON_UNUSED PyModuleDef *def) {\n    PyObject *module = NULL, *moddict, *modname;\n    if (__Pyx_check_single_interpreter())\n        return NULL;\n    if (__pyx_m)\n        return __Pyx_NewRef(__pyx_m);\n    modname = PyObject_GetAttrString(spec, \"name\");\n    if (unlikely(!modname)) goto bad;\n    module = PyModule_NewObject(modname);\n    Py_DECREF(modname);\n    if (unlikely(!module)) goto bad;\n    moddict = PyModule_GetDict(module);\n    if (unlikely(!moddict)) goto bad;\n    if (unlikely(__Pyx_copy_spec_to_module(spec, moddict, \"loader\", \"__loader__\", 1) < 0)) goto bad;\n    if (unlikely(__Pyx_copy_spec_to_module(spec, moddict, \"origin\", \"__file__\", 1) < 0)) goto bad;\n    if 
(unlikely(__Pyx_copy_spec_to_module(spec, moddict, \"parent\", \"__package__\", 1) < 0)) goto bad;\n    if (unlikely(__Pyx_copy_spec_to_module(spec, moddict, \"submodule_search_locations\", \"__path__\", 0) < 0)) goto bad;\n    return module;\nbad:\n    Py_XDECREF(module);\n    return NULL;\n}\n\n\nstatic CYTHON_SMALL_CODE int __pyx_pymod_exec_core(PyObject *__pyx_pyinit_module)\n#endif\n#endif\n{\n  PyObject *__pyx_t_1 = NULL;\n  static PyThread_type_lock __pyx_t_2[8];\n  int __pyx_lineno = 0;\n  const char *__pyx_filename = NULL;\n  int __pyx_clineno = 0;\n  __Pyx_RefNannyDeclarations\n  #if CYTHON_PEP489_MULTI_PHASE_INIT\n  if (__pyx_m) {\n    if (__pyx_m == __pyx_pyinit_module) return 0;\n    PyErr_SetString(PyExc_RuntimeError, \"Module 'core' has already been imported. Re-initialisation is not supported.\");\n    return -1;\n  }\n  #elif PY_MAJOR_VERSION >= 3\n  if (__pyx_m) return __Pyx_NewRef(__pyx_m);\n  #endif\n  #if CYTHON_REFNANNY\n__Pyx_RefNanny = __Pyx_RefNannyImportAPI(\"refnanny\");\nif (!__Pyx_RefNanny) {\n  PyErr_Clear();\n  __Pyx_RefNanny = __Pyx_RefNannyImportAPI(\"Cython.Runtime.refnanny\");\n  if (!__Pyx_RefNanny)\n      Py_FatalError(\"failed to import 'refnanny' module\");\n}\n#endif\n  __Pyx_RefNannySetupContext(\"__Pyx_PyMODINIT_FUNC PyInit_core(void)\", 0);\n  if (__Pyx_check_binary_version() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #ifdef __Pxy_PyFrame_Initialize_Offsets\n  __Pxy_PyFrame_Initialize_Offsets();\n  #endif\n  __pyx_empty_tuple = PyTuple_New(0); if (unlikely(!__pyx_empty_tuple)) __PYX_ERR(0, 1, __pyx_L1_error)\n  __pyx_empty_bytes = PyBytes_FromStringAndSize(\"\", 0); if (unlikely(!__pyx_empty_bytes)) __PYX_ERR(0, 1, __pyx_L1_error)\n  __pyx_empty_unicode = PyUnicode_FromStringAndSize(\"\", 0); if (unlikely(!__pyx_empty_unicode)) __PYX_ERR(0, 1, __pyx_L1_error)\n  #ifdef __Pyx_CyFunction_USED\n  if (__pyx_CyFunction_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  #ifdef __Pyx_FusedFunction_USED\n  if 
(__pyx_FusedFunction_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  #ifdef __Pyx_Coroutine_USED\n  if (__pyx_Coroutine_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  #ifdef __Pyx_Generator_USED\n  if (__pyx_Generator_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  #ifdef __Pyx_AsyncGen_USED\n  if (__pyx_AsyncGen_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  #ifdef __Pyx_StopAsyncIteration_USED\n  if (__pyx_StopAsyncIteration_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  /*--- Library function declarations ---*/\n  /*--- Threads initialization code ---*/\n  #if defined(WITH_THREAD) && PY_VERSION_HEX < 0x030700F0 && defined(__PYX_FORCE_INIT_THREADS) && __PYX_FORCE_INIT_THREADS\n  PyEval_InitThreads();\n  #endif\n  /*--- Module creation code ---*/\n  #if CYTHON_PEP489_MULTI_PHASE_INIT\n  __pyx_m = __pyx_pyinit_module;\n  Py_INCREF(__pyx_m);\n  #else\n  #if PY_MAJOR_VERSION < 3\n  __pyx_m = Py_InitModule4(\"core\", __pyx_methods, 0, 0, PYTHON_API_VERSION); Py_XINCREF(__pyx_m);\n  #else\n  __pyx_m = PyModule_Create(&__pyx_moduledef);\n  #endif\n  if (unlikely(!__pyx_m)) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  __pyx_d = PyModule_GetDict(__pyx_m); if (unlikely(!__pyx_d)) __PYX_ERR(0, 1, __pyx_L1_error)\n  Py_INCREF(__pyx_d);\n  __pyx_b = PyImport_AddModule(__Pyx_BUILTIN_MODULE_NAME); if (unlikely(!__pyx_b)) __PYX_ERR(0, 1, __pyx_L1_error)\n  Py_INCREF(__pyx_b);\n  __pyx_cython_runtime = PyImport_AddModule((char *) \"cython_runtime\"); if (unlikely(!__pyx_cython_runtime)) __PYX_ERR(0, 1, __pyx_L1_error)\n  Py_INCREF(__pyx_cython_runtime);\n  if (PyObject_SetAttrString(__pyx_m, \"__builtins__\", __pyx_b) < 0) __PYX_ERR(0, 1, __pyx_L1_error);\n  /*--- Initialize various global constants etc. 
---*/\n  if (__Pyx_InitGlobals() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #if PY_MAJOR_VERSION < 3 && (__PYX_DEFAULT_STRING_ENCODING_IS_ASCII || __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT)\n  if (__Pyx_init_sys_getdefaultencoding_params() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n  if (__pyx_module_is_main_TTS__tts__utils__monotonic_align__core) {\n    if (PyObject_SetAttr(__pyx_m, __pyx_n_s_name_2, __pyx_n_s_main) < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  }\n  #if PY_MAJOR_VERSION >= 3\n  {\n    PyObject *modules = PyImport_GetModuleDict(); if (unlikely(!modules)) __PYX_ERR(0, 1, __pyx_L1_error)\n    if (!PyDict_GetItemString(modules, \"TTS.tts.utils.monotonic_align.core\")) {\n      if (unlikely(PyDict_SetItemString(modules, \"TTS.tts.utils.monotonic_align.core\", __pyx_m) < 0)) __PYX_ERR(0, 1, __pyx_L1_error)\n    }\n  }\n  #endif\n  /*--- Builtin init code ---*/\n  if (__Pyx_InitCachedBuiltins() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  /*--- Constants init code ---*/\n  if (__Pyx_InitCachedConstants() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  /*--- Global type/function init code ---*/\n  (void)__Pyx_modinit_global_init_code();\n  (void)__Pyx_modinit_variable_export_code();\n  (void)__Pyx_modinit_function_export_code();\n  if (unlikely(__Pyx_modinit_type_init_code() < 0)) __PYX_ERR(0, 1, __pyx_L1_error)\n  if (unlikely(__Pyx_modinit_type_import_code() < 0)) __PYX_ERR(0, 1, __pyx_L1_error)\n  (void)__Pyx_modinit_variable_import_code();\n  (void)__Pyx_modinit_function_import_code();\n  /*--- Execution code ---*/\n  #if defined(__Pyx_Generator_USED) || defined(__Pyx_Coroutine_USED)\n  if (__Pyx_patch_abc() < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  #endif\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":1\n * import numpy as np             # <<<<<<<<<<<<<<\n * \n * cimport cython\n */\n  __pyx_t_1 = __Pyx_Import(__pyx_n_s_numpy, 0, 0); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 1, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (PyDict_SetItem(__pyx_d, __pyx_n_s_np, 
__pyx_t_1) < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":42\n * @cython.boundscheck(False)\n * @cython.wraparound(False)\n * cpdef void maximum_path_c(int[:,:,::1] paths, float[:,:,::1] values, int[::1] t_xs, int[::1] t_ys, float max_neg_val=-1e9) nogil:             # <<<<<<<<<<<<<<\n *   cdef int b = values.shape[0]\n * \n */\n  __pyx_k_ = (-1e9);\n  __pyx_k_ = (-1e9);\n\n  /* \"TTS/tts/utils/monotonic_align/core.pyx\":1\n * import numpy as np             # <<<<<<<<<<<<<<\n * \n * cimport cython\n */\n  __pyx_t_1 = __Pyx_PyDict_NewPresized(0); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 1, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (PyDict_SetItem(__pyx_d, __pyx_n_s_test, __pyx_t_1) < 0) __PYX_ERR(0, 1, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":209\n *         info.obj = self\n * \n *     __pyx_getbuffer = capsule(<void *> &__pyx_array_getbuffer, \"getbuffer(obj, view, flags)\")             # <<<<<<<<<<<<<<\n * \n *     def __dealloc__(array self):\n */\n  __pyx_t_1 = __pyx_capsule_create(((void *)(&__pyx_array_getbuffer)), ((char *)\"getbuffer(obj, view, flags)\")); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 209, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (PyDict_SetItem((PyObject *)__pyx_array_type->tp_dict, __pyx_n_s_pyx_getbuffer, __pyx_t_1) < 0) __PYX_ERR(2, 209, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  PyType_Modified(__pyx_array_type);\n\n  /* \"View.MemoryView\":286\n *         return self.name\n * \n * cdef generic = Enum(\"<strided and direct or indirect>\")             # <<<<<<<<<<<<<<\n * cdef strided = Enum(\"<strided and direct>\") # default\n * cdef indirect = Enum(\"<strided and indirect>\")\n */\n  __pyx_t_1 = __Pyx_PyObject_Call(((PyObject *)__pyx_MemviewEnum_type), __pyx_tuple__22, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 286, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  
__Pyx_XGOTREF(generic);\n  __Pyx_DECREF_SET(generic, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":287\n * \n * cdef generic = Enum(\"<strided and direct or indirect>\")\n * cdef strided = Enum(\"<strided and direct>\") # default             # <<<<<<<<<<<<<<\n * cdef indirect = Enum(\"<strided and indirect>\")\n * \n */\n  __pyx_t_1 = __Pyx_PyObject_Call(((PyObject *)__pyx_MemviewEnum_type), __pyx_tuple__23, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 287, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_XGOTREF(strided);\n  __Pyx_DECREF_SET(strided, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":288\n * cdef generic = Enum(\"<strided and direct or indirect>\")\n * cdef strided = Enum(\"<strided and direct>\") # default\n * cdef indirect = Enum(\"<strided and indirect>\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_1 = __Pyx_PyObject_Call(((PyObject *)__pyx_MemviewEnum_type), __pyx_tuple__24, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 288, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_XGOTREF(indirect);\n  __Pyx_DECREF_SET(indirect, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":291\n * \n * \n * cdef contiguous = Enum(\"<contiguous and direct>\")             # <<<<<<<<<<<<<<\n * cdef indirect_contiguous = Enum(\"<contiguous and indirect>\")\n * \n */\n  __pyx_t_1 = __Pyx_PyObject_Call(((PyObject *)__pyx_MemviewEnum_type), __pyx_tuple__25, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 291, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_XGOTREF(contiguous);\n  __Pyx_DECREF_SET(contiguous, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":292\n * \n * cdef contiguous = Enum(\"<contiguous and direct>\")\n * cdef indirect_contiguous = Enum(\"<contiguous and indirect>\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_1 = __Pyx_PyObject_Call(((PyObject 
*)__pyx_MemviewEnum_type), __pyx_tuple__26, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 292, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  __Pyx_XGOTREF(indirect_contiguous);\n  __Pyx_DECREF_SET(indirect_contiguous, __pyx_t_1);\n  __Pyx_GIVEREF(__pyx_t_1);\n  __pyx_t_1 = 0;\n\n  /* \"View.MemoryView\":316\n * \n * DEF THREAD_LOCKS_PREALLOCATED = 8\n * cdef int __pyx_memoryview_thread_locks_used = 0             # <<<<<<<<<<<<<<\n * cdef PyThread_type_lock[THREAD_LOCKS_PREALLOCATED] __pyx_memoryview_thread_locks = [\n *     PyThread_allocate_lock(),\n */\n  __pyx_memoryview_thread_locks_used = 0;\n\n  /* \"View.MemoryView\":317\n * DEF THREAD_LOCKS_PREALLOCATED = 8\n * cdef int __pyx_memoryview_thread_locks_used = 0\n * cdef PyThread_type_lock[THREAD_LOCKS_PREALLOCATED] __pyx_memoryview_thread_locks = [             # <<<<<<<<<<<<<<\n *     PyThread_allocate_lock(),\n *     PyThread_allocate_lock(),\n */\n  __pyx_t_2[0] = PyThread_allocate_lock();\n  __pyx_t_2[1] = PyThread_allocate_lock();\n  __pyx_t_2[2] = PyThread_allocate_lock();\n  __pyx_t_2[3] = PyThread_allocate_lock();\n  __pyx_t_2[4] = PyThread_allocate_lock();\n  __pyx_t_2[5] = PyThread_allocate_lock();\n  __pyx_t_2[6] = PyThread_allocate_lock();\n  __pyx_t_2[7] = PyThread_allocate_lock();\n  memcpy(&(__pyx_memoryview_thread_locks[0]), __pyx_t_2, sizeof(__pyx_memoryview_thread_locks[0]) * (8));\n\n  /* \"View.MemoryView\":549\n *         info.obj = self\n * \n *     __pyx_getbuffer = capsule(<void *> &__pyx_memoryview_getbuffer, \"getbuffer(obj, view, flags)\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_1 = __pyx_capsule_create(((void *)(&__pyx_memoryview_getbuffer)), ((char *)\"getbuffer(obj, view, flags)\")); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 549, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (PyDict_SetItem((PyObject *)__pyx_memoryview_type->tp_dict, __pyx_n_s_pyx_getbuffer, __pyx_t_1) < 0) __PYX_ERR(2, 549, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  
PyType_Modified(__pyx_memoryview_type);\n\n  /* \"View.MemoryView\":995\n *         return self.from_object\n * \n *     __pyx_getbuffer = capsule(<void *> &__pyx_memoryview_getbuffer, \"getbuffer(obj, view, flags)\")             # <<<<<<<<<<<<<<\n * \n * \n */\n  __pyx_t_1 = __pyx_capsule_create(((void *)(&__pyx_memoryview_getbuffer)), ((char *)\"getbuffer(obj, view, flags)\")); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 995, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (PyDict_SetItem((PyObject *)__pyx_memoryviewslice_type->tp_dict, __pyx_n_s_pyx_getbuffer, __pyx_t_1) < 0) __PYX_ERR(2, 995, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n  PyType_Modified(__pyx_memoryviewslice_type);\n\n  /* \"(tree fragment)\":1\n * def __pyx_unpickle_Enum(__pyx_type, long __pyx_checksum, __pyx_state):             # <<<<<<<<<<<<<<\n *     cdef object __pyx_PickleError\n *     cdef object __pyx_result\n */\n  __pyx_t_1 = PyCFunction_NewEx(&__pyx_mdef_15View_dot_MemoryView_1__pyx_unpickle_Enum, NULL, __pyx_n_s_View_MemoryView); if (unlikely(!__pyx_t_1)) __PYX_ERR(2, 1, __pyx_L1_error)\n  __Pyx_GOTREF(__pyx_t_1);\n  if (PyDict_SetItem(__pyx_d, __pyx_n_s_pyx_unpickle_Enum, __pyx_t_1) < 0) __PYX_ERR(2, 1, __pyx_L1_error)\n  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;\n\n  /* \"(tree fragment)\":11\n *         __pyx_unpickle_Enum__set_state(<Enum> __pyx_result, __pyx_state)\n *     return __pyx_result\n * cdef __pyx_unpickle_Enum__set_state(Enum __pyx_result, tuple __pyx_state):             # <<<<<<<<<<<<<<\n *     __pyx_result.name = __pyx_state[0]\n *     if len(__pyx_state) > 1 and hasattr(__pyx_result, '__dict__'):\n */\n\n  /*--- Wrapped vars code ---*/\n\n  goto __pyx_L0;\n  __pyx_L1_error:;\n  __Pyx_XDECREF(__pyx_t_1);\n  if (__pyx_m) {\n    if (__pyx_d) {\n      __Pyx_AddTraceback(\"init TTS.tts.utils.monotonic_align.core\", __pyx_clineno, __pyx_lineno, __pyx_filename);\n    }\n    Py_CLEAR(__pyx_m);\n  } else if (!PyErr_Occurred()) {\n    
PyErr_SetString(PyExc_ImportError, \"init TTS.tts.utils.monotonic_align.core\");\n  }\n  __pyx_L0:;\n  __Pyx_RefNannyFinishContext();\n  #if CYTHON_PEP489_MULTI_PHASE_INIT\n  return (__pyx_m != NULL) ? 0 : -1;\n  #elif PY_MAJOR_VERSION >= 3\n  return __pyx_m;\n  #else\n  return;\n  #endif\n}\n\n/* --- Runtime support code --- */\n/* Refnanny */\n#if CYTHON_REFNANNY\nstatic __Pyx_RefNannyAPIStruct *__Pyx_RefNannyImportAPI(const char *modname) {\n    PyObject *m = NULL, *p = NULL;\n    void *r = NULL;\n    m = PyImport_ImportModule(modname);\n    if (!m) goto end;\n    p = PyObject_GetAttrString(m, \"RefNannyAPI\");\n    if (!p) goto end;\n    r = PyLong_AsVoidPtr(p);\nend:\n    Py_XDECREF(p);\n    Py_XDECREF(m);\n    return (__Pyx_RefNannyAPIStruct *)r;\n}\n#endif\n\n/* PyObjectGetAttrStr */\n#if CYTHON_USE_TYPE_SLOTS\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_GetAttrStr(PyObject* obj, PyObject* attr_name) {\n    PyTypeObject* tp = Py_TYPE(obj);\n    if (likely(tp->tp_getattro))\n        return tp->tp_getattro(obj, attr_name);\n#if PY_MAJOR_VERSION < 3\n    if (likely(tp->tp_getattr))\n        return tp->tp_getattr(obj, PyString_AS_STRING(attr_name));\n#endif\n    return PyObject_GetAttr(obj, attr_name);\n}\n#endif\n\n/* GetBuiltinName */\nstatic PyObject *__Pyx_GetBuiltinName(PyObject *name) {\n    PyObject* result = __Pyx_PyObject_GetAttrStr(__pyx_b, name);\n    if (unlikely(!result)) {\n        PyErr_Format(PyExc_NameError,\n#if PY_MAJOR_VERSION >= 3\n            \"name '%U' is not defined\", name);\n#else\n            \"name '%.200s' is not defined\", PyString_AS_STRING(name));\n#endif\n    }\n    return result;\n}\n\n/* MemviewSliceInit */\nstatic int\n__Pyx_init_memviewslice(struct __pyx_memoryview_obj *memview,\n                        int ndim,\n                        __Pyx_memviewslice *memviewslice,\n                        int memview_is_new_reference)\n{\n    __Pyx_RefNannyDeclarations\n    int i, retval=-1;\n    Py_buffer *buf = &memview->view;\n   
 __Pyx_RefNannySetupContext(\"init_memviewslice\", 0);\n    if (unlikely(memviewslice->memview || memviewslice->data)) {\n        PyErr_SetString(PyExc_ValueError,\n            \"memviewslice is already initialized!\");\n        goto fail;\n    }\n    if (buf->strides) {\n        for (i = 0; i < ndim; i++) {\n            memviewslice->strides[i] = buf->strides[i];\n        }\n    } else {\n        Py_ssize_t stride = buf->itemsize;\n        for (i = ndim - 1; i >= 0; i--) {\n            memviewslice->strides[i] = stride;\n            stride *= buf->shape[i];\n        }\n    }\n    for (i = 0; i < ndim; i++) {\n        memviewslice->shape[i]   = buf->shape[i];\n        if (buf->suboffsets) {\n            memviewslice->suboffsets[i] = buf->suboffsets[i];\n        } else {\n            memviewslice->suboffsets[i] = -1;\n        }\n    }\n    memviewslice->memview = memview;\n    memviewslice->data = (char *)buf->buf;\n    if (__pyx_add_acquisition_count(memview) == 0 && !memview_is_new_reference) {\n        Py_INCREF(memview);\n    }\n    retval = 0;\n    goto no_fail;\nfail:\n    memviewslice->memview = 0;\n    memviewslice->data = 0;\n    retval = -1;\nno_fail:\n    __Pyx_RefNannyFinishContext();\n    return retval;\n}\n#ifndef Py_NO_RETURN\n#define Py_NO_RETURN\n#endif\nstatic void __pyx_fatalerror(const char *fmt, ...) 
Py_NO_RETURN {\n    va_list vargs;\n    char msg[200];\n#ifdef HAVE_STDARG_PROTOTYPES\n    va_start(vargs, fmt);\n#else\n    va_start(vargs);\n#endif\n    vsnprintf(msg, 200, fmt, vargs);\n    va_end(vargs);\n    Py_FatalError(msg);\n}\nstatic CYTHON_INLINE int\n__pyx_add_acquisition_count_locked(__pyx_atomic_int *acquisition_count,\n                                   PyThread_type_lock lock)\n{\n    int result;\n    PyThread_acquire_lock(lock, 1);\n    result = (*acquisition_count)++;\n    PyThread_release_lock(lock);\n    return result;\n}\nstatic CYTHON_INLINE int\n__pyx_sub_acquisition_count_locked(__pyx_atomic_int *acquisition_count,\n                                   PyThread_type_lock lock)\n{\n    int result;\n    PyThread_acquire_lock(lock, 1);\n    result = (*acquisition_count)--;\n    PyThread_release_lock(lock);\n    return result;\n}\nstatic CYTHON_INLINE void\n__Pyx_INC_MEMVIEW(__Pyx_memviewslice *memslice, int have_gil, int lineno)\n{\n    int first_time;\n    struct __pyx_memoryview_obj *memview = memslice->memview;\n    if (unlikely(!memview || (PyObject *) memview == Py_None))\n        return;\n    if (unlikely(__pyx_get_slice_count(memview) < 0))\n        __pyx_fatalerror(\"Acquisition count is %d (line %d)\",\n                         __pyx_get_slice_count(memview), lineno);\n    first_time = __pyx_add_acquisition_count(memview) == 0;\n    if (unlikely(first_time)) {\n        if (have_gil) {\n            Py_INCREF((PyObject *) memview);\n        } else {\n            PyGILState_STATE _gilstate = PyGILState_Ensure();\n            Py_INCREF((PyObject *) memview);\n            PyGILState_Release(_gilstate);\n        }\n    }\n}\nstatic CYTHON_INLINE void __Pyx_XDEC_MEMVIEW(__Pyx_memviewslice *memslice,\n                                             int have_gil, int lineno) {\n    int last_time;\n    struct __pyx_memoryview_obj *memview = memslice->memview;\n    if (unlikely(!memview || (PyObject *) memview == Py_None)) {\n        memslice->memview 
= NULL;\n        return;\n    }\n    if (unlikely(__pyx_get_slice_count(memview) <= 0))\n        __pyx_fatalerror(\"Acquisition count is %d (line %d)\",\n                         __pyx_get_slice_count(memview), lineno);\n    last_time = __pyx_sub_acquisition_count(memview) == 1;\n    memslice->data = NULL;\n    if (unlikely(last_time)) {\n        if (have_gil) {\n            Py_CLEAR(memslice->memview);\n        } else {\n            PyGILState_STATE _gilstate = PyGILState_Ensure();\n            Py_CLEAR(memslice->memview);\n            PyGILState_Release(_gilstate);\n        }\n    } else {\n        memslice->memview = NULL;\n    }\n}\n\n/* RaiseArgTupleInvalid */\nstatic void __Pyx_RaiseArgtupleInvalid(\n    const char* func_name,\n    int exact,\n    Py_ssize_t num_min,\n    Py_ssize_t num_max,\n    Py_ssize_t num_found)\n{\n    Py_ssize_t num_expected;\n    const char *more_or_less;\n    if (num_found < num_min) {\n        num_expected = num_min;\n        more_or_less = \"at least\";\n    } else {\n        num_expected = num_max;\n        more_or_less = \"at most\";\n    }\n    if (exact) {\n        more_or_less = \"exactly\";\n    }\n    PyErr_Format(PyExc_TypeError,\n                 \"%.200s() takes %.8s %\" CYTHON_FORMAT_SSIZE_T \"d positional argument%.1s (%\" CYTHON_FORMAT_SSIZE_T \"d given)\",\n                 func_name, more_or_less, num_expected,\n                 (num_expected == 1) ? 
\"\" : \"s\", num_found);\n}\n\n/* RaiseDoubleKeywords */\nstatic void __Pyx_RaiseDoubleKeywordsError(\n    const char* func_name,\n    PyObject* kw_name)\n{\n    PyErr_Format(PyExc_TypeError,\n        #if PY_MAJOR_VERSION >= 3\n        \"%s() got multiple values for keyword argument '%U'\", func_name, kw_name);\n        #else\n        \"%s() got multiple values for keyword argument '%s'\", func_name,\n        PyString_AsString(kw_name));\n        #endif\n}\n\n/* ParseKeywords */\nstatic int __Pyx_ParseOptionalKeywords(\n    PyObject *kwds,\n    PyObject **argnames[],\n    PyObject *kwds2,\n    PyObject *values[],\n    Py_ssize_t num_pos_args,\n    const char* function_name)\n{\n    PyObject *key = 0, *value = 0;\n    Py_ssize_t pos = 0;\n    PyObject*** name;\n    PyObject*** first_kw_arg = argnames + num_pos_args;\n    while (PyDict_Next(kwds, &pos, &key, &value)) {\n        name = first_kw_arg;\n        while (*name && (**name != key)) name++;\n        if (*name) {\n            values[name-argnames] = value;\n            continue;\n        }\n        name = first_kw_arg;\n        #if PY_MAJOR_VERSION < 3\n        if (likely(PyString_Check(key))) {\n            while (*name) {\n                if ((CYTHON_COMPILING_IN_PYPY || PyString_GET_SIZE(**name) == PyString_GET_SIZE(key))\n                        && _PyString_Eq(**name, key)) {\n                    values[name-argnames] = value;\n                    break;\n                }\n                name++;\n            }\n            if (*name) continue;\n            else {\n                PyObject*** argname = argnames;\n                while (argname != first_kw_arg) {\n                    if ((**argname == key) || (\n                            (CYTHON_COMPILING_IN_PYPY || PyString_GET_SIZE(**argname) == PyString_GET_SIZE(key))\n                             && _PyString_Eq(**argname, key))) {\n                        goto arg_passed_twice;\n                    }\n                    argname++;\n                
}\n            }\n        } else\n        #endif\n        if (likely(PyUnicode_Check(key))) {\n            while (*name) {\n                int cmp = (**name == key) ? 0 :\n                #if !CYTHON_COMPILING_IN_PYPY && PY_MAJOR_VERSION >= 3\n                    (__Pyx_PyUnicode_GET_LENGTH(**name) != __Pyx_PyUnicode_GET_LENGTH(key)) ? 1 :\n                #endif\n                    PyUnicode_Compare(**name, key);\n                if (cmp < 0 && unlikely(PyErr_Occurred())) goto bad;\n                if (cmp == 0) {\n                    values[name-argnames] = value;\n                    break;\n                }\n                name++;\n            }\n            if (*name) continue;\n            else {\n                PyObject*** argname = argnames;\n                while (argname != first_kw_arg) {\n                    int cmp = (**argname == key) ? 0 :\n                    #if !CYTHON_COMPILING_IN_PYPY && PY_MAJOR_VERSION >= 3\n                        (__Pyx_PyUnicode_GET_LENGTH(**argname) != __Pyx_PyUnicode_GET_LENGTH(key)) ? 
1 :\n                    #endif\n                        PyUnicode_Compare(**argname, key);\n                    if (cmp < 0 && unlikely(PyErr_Occurred())) goto bad;\n                    if (cmp == 0) goto arg_passed_twice;\n                    argname++;\n                }\n            }\n        } else\n            goto invalid_keyword_type;\n        if (kwds2) {\n            if (unlikely(PyDict_SetItem(kwds2, key, value))) goto bad;\n        } else {\n            goto invalid_keyword;\n        }\n    }\n    return 0;\narg_passed_twice:\n    __Pyx_RaiseDoubleKeywordsError(function_name, key);\n    goto bad;\ninvalid_keyword_type:\n    PyErr_Format(PyExc_TypeError,\n        \"%.200s() keywords must be strings\", function_name);\n    goto bad;\ninvalid_keyword:\n    PyErr_Format(PyExc_TypeError,\n    #if PY_MAJOR_VERSION < 3\n        \"%.200s() got an unexpected keyword argument '%.200s'\",\n        function_name, PyString_AsString(key));\n    #else\n        \"%s() got an unexpected keyword argument '%U'\",\n        function_name, key);\n    #endif\nbad:\n    return -1;\n}\n\n/* None */\nstatic CYTHON_INLINE void __Pyx_RaiseUnboundLocalError(const char *varname) {\n    PyErr_Format(PyExc_UnboundLocalError, \"local variable '%s' referenced before assignment\", varname);\n}\n\n/* GetTopmostException */\n#if CYTHON_USE_EXC_INFO_STACK\nstatic _PyErr_StackItem *\n__Pyx_PyErr_GetTopmostException(PyThreadState *tstate)\n{\n    _PyErr_StackItem *exc_info = tstate->exc_info;\n    while ((exc_info->exc_type == NULL || exc_info->exc_type == Py_None) &&\n           exc_info->previous_item != NULL)\n    {\n        exc_info = exc_info->previous_item;\n    }\n    return exc_info;\n}\n#endif\n\n/* SaveResetException */\n#if CYTHON_FAST_THREAD_STATE\nstatic CYTHON_INLINE void __Pyx__ExceptionSave(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb) {\n    #if CYTHON_USE_EXC_INFO_STACK\n    _PyErr_StackItem *exc_info = __Pyx_PyErr_GetTopmostException(tstate);\n   
 *type = exc_info->exc_type;\n    *value = exc_info->exc_value;\n    *tb = exc_info->exc_traceback;\n    #else\n    *type = tstate->exc_type;\n    *value = tstate->exc_value;\n    *tb = tstate->exc_traceback;\n    #endif\n    Py_XINCREF(*type);\n    Py_XINCREF(*value);\n    Py_XINCREF(*tb);\n}\nstatic CYTHON_INLINE void __Pyx__ExceptionReset(PyThreadState *tstate, PyObject *type, PyObject *value, PyObject *tb) {\n    PyObject *tmp_type, *tmp_value, *tmp_tb;\n    #if CYTHON_USE_EXC_INFO_STACK\n    _PyErr_StackItem *exc_info = tstate->exc_info;\n    tmp_type = exc_info->exc_type;\n    tmp_value = exc_info->exc_value;\n    tmp_tb = exc_info->exc_traceback;\n    exc_info->exc_type = type;\n    exc_info->exc_value = value;\n    exc_info->exc_traceback = tb;\n    #else\n    tmp_type = tstate->exc_type;\n    tmp_value = tstate->exc_value;\n    tmp_tb = tstate->exc_traceback;\n    tstate->exc_type = type;\n    tstate->exc_value = value;\n    tstate->exc_traceback = tb;\n    #endif\n    Py_XDECREF(tmp_type);\n    Py_XDECREF(tmp_value);\n    Py_XDECREF(tmp_tb);\n}\n#endif\n\n/* PyErrExceptionMatches */\n#if CYTHON_FAST_THREAD_STATE\nstatic int __Pyx_PyErr_ExceptionMatchesTuple(PyObject *exc_type, PyObject *tuple) {\n    Py_ssize_t i, n;\n    n = PyTuple_GET_SIZE(tuple);\n#if PY_MAJOR_VERSION >= 3\n    for (i=0; i<n; i++) {\n        if (exc_type == PyTuple_GET_ITEM(tuple, i)) return 1;\n    }\n#endif\n    for (i=0; i<n; i++) {\n        if (__Pyx_PyErr_GivenExceptionMatches(exc_type, PyTuple_GET_ITEM(tuple, i))) return 1;\n    }\n    return 0;\n}\nstatic CYTHON_INLINE int __Pyx_PyErr_ExceptionMatchesInState(PyThreadState* tstate, PyObject* err) {\n    PyObject *exc_type = tstate->curexc_type;\n    if (exc_type == err) return 1;\n    if (unlikely(!exc_type)) return 0;\n    if (unlikely(PyTuple_Check(err)))\n        return __Pyx_PyErr_ExceptionMatchesTuple(exc_type, err);\n    return __Pyx_PyErr_GivenExceptionMatches(exc_type, err);\n}\n#endif\n\n/* GetException */\n#if 
CYTHON_FAST_THREAD_STATE\nstatic int __Pyx__GetException(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb)\n#else\nstatic int __Pyx_GetException(PyObject **type, PyObject **value, PyObject **tb)\n#endif\n{\n    PyObject *local_type, *local_value, *local_tb;\n#if CYTHON_FAST_THREAD_STATE\n    PyObject *tmp_type, *tmp_value, *tmp_tb;\n    local_type = tstate->curexc_type;\n    local_value = tstate->curexc_value;\n    local_tb = tstate->curexc_traceback;\n    tstate->curexc_type = 0;\n    tstate->curexc_value = 0;\n    tstate->curexc_traceback = 0;\n#else\n    PyErr_Fetch(&local_type, &local_value, &local_tb);\n#endif\n    PyErr_NormalizeException(&local_type, &local_value, &local_tb);\n#if CYTHON_FAST_THREAD_STATE\n    if (unlikely(tstate->curexc_type))\n#else\n    if (unlikely(PyErr_Occurred()))\n#endif\n        goto bad;\n    #if PY_MAJOR_VERSION >= 3\n    if (local_tb) {\n        if (unlikely(PyException_SetTraceback(local_value, local_tb) < 0))\n            goto bad;\n    }\n    #endif\n    Py_XINCREF(local_tb);\n    Py_XINCREF(local_type);\n    Py_XINCREF(local_value);\n    *type = local_type;\n    *value = local_value;\n    *tb = local_tb;\n#if CYTHON_FAST_THREAD_STATE\n    #if CYTHON_USE_EXC_INFO_STACK\n    {\n        _PyErr_StackItem *exc_info = tstate->exc_info;\n        tmp_type = exc_info->exc_type;\n        tmp_value = exc_info->exc_value;\n        tmp_tb = exc_info->exc_traceback;\n        exc_info->exc_type = local_type;\n        exc_info->exc_value = local_value;\n        exc_info->exc_traceback = local_tb;\n    }\n    #else\n    tmp_type = tstate->exc_type;\n    tmp_value = tstate->exc_value;\n    tmp_tb = tstate->exc_traceback;\n    tstate->exc_type = local_type;\n    tstate->exc_value = local_value;\n    tstate->exc_traceback = local_tb;\n    #endif\n    Py_XDECREF(tmp_type);\n    Py_XDECREF(tmp_value);\n    Py_XDECREF(tmp_tb);\n#else\n    PyErr_SetExcInfo(local_type, local_value, local_tb);\n#endif\n    return 0;\nbad:\n    
*type = 0;\n    *value = 0;\n    *tb = 0;\n    Py_XDECREF(local_type);\n    Py_XDECREF(local_value);\n    Py_XDECREF(local_tb);\n    return -1;\n}\n\n/* PyObjectCall */\n#if CYTHON_COMPILING_IN_CPYTHON\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_Call(PyObject *func, PyObject *arg, PyObject *kw) {\n    PyObject *result;\n    ternaryfunc call = Py_TYPE(func)->tp_call;\n    if (unlikely(!call))\n        return PyObject_Call(func, arg, kw);\n    if (unlikely(Py_EnterRecursiveCall((char*)\" while calling a Python object\")))\n        return NULL;\n    result = (*call)(func, arg, kw);\n    Py_LeaveRecursiveCall();\n    if (unlikely(!result) && unlikely(!PyErr_Occurred())) {\n        PyErr_SetString(\n            PyExc_SystemError,\n            \"NULL result without error in PyObject_Call\");\n    }\n    return result;\n}\n#endif\n\n/* PyErrFetchRestore */\n#if CYTHON_FAST_THREAD_STATE\nstatic CYTHON_INLINE void __Pyx_ErrRestoreInState(PyThreadState *tstate, PyObject *type, PyObject *value, PyObject *tb) {\n    PyObject *tmp_type, *tmp_value, *tmp_tb;\n    tmp_type = tstate->curexc_type;\n    tmp_value = tstate->curexc_value;\n    tmp_tb = tstate->curexc_traceback;\n    tstate->curexc_type = type;\n    tstate->curexc_value = value;\n    tstate->curexc_traceback = tb;\n    Py_XDECREF(tmp_type);\n    Py_XDECREF(tmp_value);\n    Py_XDECREF(tmp_tb);\n}\nstatic CYTHON_INLINE void __Pyx_ErrFetchInState(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb) {\n    *type = tstate->curexc_type;\n    *value = tstate->curexc_value;\n    *tb = tstate->curexc_traceback;\n    tstate->curexc_type = 0;\n    tstate->curexc_value = 0;\n    tstate->curexc_traceback = 0;\n}\n#endif\n\n/* RaiseException */\n#if PY_MAJOR_VERSION < 3\nstatic void __Pyx_Raise(PyObject *type, PyObject *value, PyObject *tb,\n                        CYTHON_UNUSED PyObject *cause) {\n    __Pyx_PyThreadState_declare\n    Py_XINCREF(type);\n    if (!value || value == Py_None)\n        value = 
NULL;\n    else\n        Py_INCREF(value);\n    if (!tb || tb == Py_None)\n        tb = NULL;\n    else {\n        Py_INCREF(tb);\n        if (!PyTraceBack_Check(tb)) {\n            PyErr_SetString(PyExc_TypeError,\n                \"raise: arg 3 must be a traceback or None\");\n            goto raise_error;\n        }\n    }\n    if (PyType_Check(type)) {\n#if CYTHON_COMPILING_IN_PYPY\n        if (!value) {\n            Py_INCREF(Py_None);\n            value = Py_None;\n        }\n#endif\n        PyErr_NormalizeException(&type, &value, &tb);\n    } else {\n        if (value) {\n            PyErr_SetString(PyExc_TypeError,\n                \"instance exception may not have a separate value\");\n            goto raise_error;\n        }\n        value = type;\n        type = (PyObject*) Py_TYPE(type);\n        Py_INCREF(type);\n        if (!PyType_IsSubtype((PyTypeObject *)type, (PyTypeObject *)PyExc_BaseException)) {\n            PyErr_SetString(PyExc_TypeError,\n                \"raise: exception class must be a subclass of BaseException\");\n            goto raise_error;\n        }\n    }\n    __Pyx_PyThreadState_assign\n    __Pyx_ErrRestore(type, value, tb);\n    return;\nraise_error:\n    Py_XDECREF(value);\n    Py_XDECREF(type);\n    Py_XDECREF(tb);\n    return;\n}\n#else\nstatic void __Pyx_Raise(PyObject *type, PyObject *value, PyObject *tb, PyObject *cause) {\n    PyObject* owned_instance = NULL;\n    if (tb == Py_None) {\n        tb = 0;\n    } else if (tb && !PyTraceBack_Check(tb)) {\n        PyErr_SetString(PyExc_TypeError,\n            \"raise: arg 3 must be a traceback or None\");\n        goto bad;\n    }\n    if (value == Py_None)\n        value = 0;\n    if (PyExceptionInstance_Check(type)) {\n        if (value) {\n            PyErr_SetString(PyExc_TypeError,\n                \"instance exception may not have a separate value\");\n            goto bad;\n        }\n        value = type;\n        type = (PyObject*) Py_TYPE(value);\n    } else if 
(PyExceptionClass_Check(type)) {\n        PyObject *instance_class = NULL;\n        if (value && PyExceptionInstance_Check(value)) {\n            instance_class = (PyObject*) Py_TYPE(value);\n            if (instance_class != type) {\n                int is_subclass = PyObject_IsSubclass(instance_class, type);\n                if (!is_subclass) {\n                    instance_class = NULL;\n                } else if (unlikely(is_subclass == -1)) {\n                    goto bad;\n                } else {\n                    type = instance_class;\n                }\n            }\n        }\n        if (!instance_class) {\n            PyObject *args;\n            if (!value)\n                args = PyTuple_New(0);\n            else if (PyTuple_Check(value)) {\n                Py_INCREF(value);\n                args = value;\n            } else\n                args = PyTuple_Pack(1, value);\n            if (!args)\n                goto bad;\n            owned_instance = PyObject_Call(type, args, NULL);\n            Py_DECREF(args);\n            if (!owned_instance)\n                goto bad;\n            value = owned_instance;\n            if (!PyExceptionInstance_Check(value)) {\n                PyErr_Format(PyExc_TypeError,\n                             \"calling %R should have returned an instance of \"\n                             \"BaseException, not %R\",\n                             type, Py_TYPE(value));\n                goto bad;\n            }\n        }\n    } else {\n        PyErr_SetString(PyExc_TypeError,\n            \"raise: exception class must be a subclass of BaseException\");\n        goto bad;\n    }\n    if (cause) {\n        PyObject *fixed_cause;\n        if (cause == Py_None) {\n            fixed_cause = NULL;\n        } else if (PyExceptionClass_Check(cause)) {\n            fixed_cause = PyObject_CallObject(cause, NULL);\n            if (fixed_cause == NULL)\n                goto bad;\n        } else if 
(PyExceptionInstance_Check(cause)) {\n            fixed_cause = cause;\n            Py_INCREF(fixed_cause);\n        } else {\n            PyErr_SetString(PyExc_TypeError,\n                            \"exception causes must derive from \"\n                            \"BaseException\");\n            goto bad;\n        }\n        PyException_SetCause(value, fixed_cause);\n    }\n    PyErr_SetObject(type, value);\n    if (tb) {\n#if CYTHON_COMPILING_IN_PYPY\n        PyObject *tmp_type, *tmp_value, *tmp_tb;\n        PyErr_Fetch(&tmp_type, &tmp_value, &tmp_tb);\n        Py_INCREF(tb);\n        PyErr_Restore(tmp_type, tmp_value, tb);\n        Py_XDECREF(tmp_tb);\n#else\n        PyThreadState *tstate = __Pyx_PyThreadState_Current;\n        PyObject* tmp_tb = tstate->curexc_traceback;\n        if (tb != tmp_tb) {\n            Py_INCREF(tb);\n            tstate->curexc_traceback = tb;\n            Py_XDECREF(tmp_tb);\n        }\n#endif\n    }\nbad:\n    Py_XDECREF(owned_instance);\n    return;\n}\n#endif\n\n/* ArgTypeTest */\nstatic int __Pyx__ArgTypeTest(PyObject *obj, PyTypeObject *type, const char *name, int exact)\n{\n    if (unlikely(!type)) {\n        PyErr_SetString(PyExc_SystemError, \"Missing type object\");\n        return 0;\n    }\n    else if (exact) {\n        #if PY_MAJOR_VERSION == 2\n        if ((type == &PyBaseString_Type) && likely(__Pyx_PyBaseString_CheckExact(obj))) return 1;\n        #endif\n    }\n    else {\n        if (likely(__Pyx_TypeCheck(obj, type))) return 1;\n    }\n    PyErr_Format(PyExc_TypeError,\n        \"Argument '%.200s' has incorrect type (expected %.200s, got %.200s)\",\n        name, type->tp_name, Py_TYPE(obj)->tp_name);\n    return 0;\n}\n\n/* PyCFunctionFastCall */\n#if CYTHON_FAST_PYCCALL\nstatic CYTHON_INLINE PyObject * __Pyx_PyCFunction_FastCall(PyObject *func_obj, PyObject **args, Py_ssize_t nargs) {\n    PyCFunctionObject *func = (PyCFunctionObject*)func_obj;\n    PyCFunction meth = PyCFunction_GET_FUNCTION(func);\n    
PyObject *self = PyCFunction_GET_SELF(func);\n    int flags = PyCFunction_GET_FLAGS(func);\n    assert(PyCFunction_Check(func));\n    assert(METH_FASTCALL == (flags & ~(METH_CLASS | METH_STATIC | METH_COEXIST | METH_KEYWORDS | METH_STACKLESS)));\n    assert(nargs >= 0);\n    assert(nargs == 0 || args != NULL);\n    /* _PyCFunction_FastCallDict() must not be called with an exception set,\n       because it may clear it (directly or indirectly) and so the\n       caller loses its exception */\n    assert(!PyErr_Occurred());\n    if ((PY_VERSION_HEX < 0x030700A0) || unlikely(flags & METH_KEYWORDS)) {\n        return (*((__Pyx_PyCFunctionFastWithKeywords)(void*)meth)) (self, args, nargs, NULL);\n    } else {\n        return (*((__Pyx_PyCFunctionFast)(void*)meth)) (self, args, nargs);\n    }\n}\n#endif\n\n/* PyFunctionFastCall */\n#if CYTHON_FAST_PYCALL\nstatic PyObject* __Pyx_PyFunction_FastCallNoKw(PyCodeObject *co, PyObject **args, Py_ssize_t na,\n                                               PyObject *globals) {\n    PyFrameObject *f;\n    PyThreadState *tstate = __Pyx_PyThreadState_Current;\n    PyObject **fastlocals;\n    Py_ssize_t i;\n    PyObject *result;\n    assert(globals != NULL);\n    /* XXX Perhaps we should create a specialized\n       PyFrame_New() that doesn't take locals, but does\n       take builtins without sanity checking them.\n       */\n    assert(tstate != NULL);\n    f = PyFrame_New(tstate, co, globals, NULL);\n    if (f == NULL) {\n        return NULL;\n    }\n    fastlocals = __Pyx_PyFrame_GetLocalsplus(f);\n    for (i = 0; i < na; i++) {\n        Py_INCREF(*args);\n        fastlocals[i] = *args++;\n    }\n    result = PyEval_EvalFrameEx(f,0);\n    ++tstate->recursion_depth;\n    Py_DECREF(f);\n    --tstate->recursion_depth;\n    return result;\n}\n#if 1 || PY_VERSION_HEX < 0x030600B1\nstatic PyObject *__Pyx_PyFunction_FastCallDict(PyObject *func, PyObject **args, Py_ssize_t nargs, PyObject *kwargs) {\n    PyCodeObject *co = (PyCodeObject 
*)PyFunction_GET_CODE(func);\n    PyObject *globals = PyFunction_GET_GLOBALS(func);\n    PyObject *argdefs = PyFunction_GET_DEFAULTS(func);\n    PyObject *closure;\n#if PY_MAJOR_VERSION >= 3\n    PyObject *kwdefs;\n#endif\n    PyObject *kwtuple, **k;\n    PyObject **d;\n    Py_ssize_t nd;\n    Py_ssize_t nk;\n    PyObject *result;\n    assert(kwargs == NULL || PyDict_Check(kwargs));\n    nk = kwargs ? PyDict_Size(kwargs) : 0;\n    if (Py_EnterRecursiveCall((char*)\" while calling a Python object\")) {\n        return NULL;\n    }\n    if (\n#if PY_MAJOR_VERSION >= 3\n            co->co_kwonlyargcount == 0 &&\n#endif\n            likely(kwargs == NULL || nk == 0) &&\n            co->co_flags == (CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE)) {\n        if (argdefs == NULL && co->co_argcount == nargs) {\n            result = __Pyx_PyFunction_FastCallNoKw(co, args, nargs, globals);\n            goto done;\n        }\n        else if (nargs == 0 && argdefs != NULL\n                 && co->co_argcount == Py_SIZE(argdefs)) {\n            /* function called with no arguments, but all parameters have\n               a default value: use default values as arguments .*/\n            args = &PyTuple_GET_ITEM(argdefs, 0);\n            result =__Pyx_PyFunction_FastCallNoKw(co, args, Py_SIZE(argdefs), globals);\n            goto done;\n        }\n    }\n    if (kwargs != NULL) {\n        Py_ssize_t pos, i;\n        kwtuple = PyTuple_New(2 * nk);\n        if (kwtuple == NULL) {\n            result = NULL;\n            goto done;\n        }\n        k = &PyTuple_GET_ITEM(kwtuple, 0);\n        pos = i = 0;\n        while (PyDict_Next(kwargs, &pos, &k[i], &k[i+1])) {\n            Py_INCREF(k[i]);\n            Py_INCREF(k[i+1]);\n            i += 2;\n        }\n        nk = i / 2;\n    }\n    else {\n        kwtuple = NULL;\n        k = NULL;\n    }\n    closure = PyFunction_GET_CLOSURE(func);\n#if PY_MAJOR_VERSION >= 3\n    kwdefs = PyFunction_GET_KW_DEFAULTS(func);\n#endif\n    if 
(argdefs != NULL) {\n        d = &PyTuple_GET_ITEM(argdefs, 0);\n        nd = Py_SIZE(argdefs);\n    }\n    else {\n        d = NULL;\n        nd = 0;\n    }\n#if PY_MAJOR_VERSION >= 3\n    result = PyEval_EvalCodeEx((PyObject*)co, globals, (PyObject *)NULL,\n                               args, (int)nargs,\n                               k, (int)nk,\n                               d, (int)nd, kwdefs, closure);\n#else\n    result = PyEval_EvalCodeEx(co, globals, (PyObject *)NULL,\n                               args, (int)nargs,\n                               k, (int)nk,\n                               d, (int)nd, closure);\n#endif\n    Py_XDECREF(kwtuple);\ndone:\n    Py_LeaveRecursiveCall();\n    return result;\n}\n#endif\n#endif\n\n/* PyObjectCall2Args */\nstatic CYTHON_UNUSED PyObject* __Pyx_PyObject_Call2Args(PyObject* function, PyObject* arg1, PyObject* arg2) {\n    PyObject *args, *result = NULL;\n    #if CYTHON_FAST_PYCALL\n    if (PyFunction_Check(function)) {\n        PyObject *args[2] = {arg1, arg2};\n        return __Pyx_PyFunction_FastCall(function, args, 2);\n    }\n    #endif\n    #if CYTHON_FAST_PYCCALL\n    if (__Pyx_PyFastCFunction_Check(function)) {\n        PyObject *args[2] = {arg1, arg2};\n        return __Pyx_PyCFunction_FastCall(function, args, 2);\n    }\n    #endif\n    args = PyTuple_New(2);\n    if (unlikely(!args)) goto done;\n    Py_INCREF(arg1);\n    PyTuple_SET_ITEM(args, 0, arg1);\n    Py_INCREF(arg2);\n    PyTuple_SET_ITEM(args, 1, arg2);\n    Py_INCREF(function);\n    result = __Pyx_PyObject_Call(function, args, NULL);\n    Py_DECREF(args);\n    Py_DECREF(function);\ndone:\n    return result;\n}\n\n/* PyObjectCallMethO */\n#if CYTHON_COMPILING_IN_CPYTHON\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_CallMethO(PyObject *func, PyObject *arg) {\n    PyObject *self, *result;\n    PyCFunction cfunc;\n    cfunc = PyCFunction_GET_FUNCTION(func);\n    self = PyCFunction_GET_SELF(func);\n    if (unlikely(Py_EnterRecursiveCall((char*)\" 
while calling a Python object\")))\n        return NULL;\n    result = cfunc(self, arg);\n    Py_LeaveRecursiveCall();\n    if (unlikely(!result) && unlikely(!PyErr_Occurred())) {\n        PyErr_SetString(\n            PyExc_SystemError,\n            \"NULL result without error in PyObject_Call\");\n    }\n    return result;\n}\n#endif\n\n/* PyObjectCallOneArg */\n#if CYTHON_COMPILING_IN_CPYTHON\nstatic PyObject* __Pyx__PyObject_CallOneArg(PyObject *func, PyObject *arg) {\n    PyObject *result;\n    PyObject *args = PyTuple_New(1);\n    if (unlikely(!args)) return NULL;\n    Py_INCREF(arg);\n    PyTuple_SET_ITEM(args, 0, arg);\n    result = __Pyx_PyObject_Call(func, args, NULL);\n    Py_DECREF(args);\n    return result;\n}\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_CallOneArg(PyObject *func, PyObject *arg) {\n#if CYTHON_FAST_PYCALL\n    if (PyFunction_Check(func)) {\n        return __Pyx_PyFunction_FastCall(func, &arg, 1);\n    }\n#endif\n    if (likely(PyCFunction_Check(func))) {\n        if (likely(PyCFunction_GET_FLAGS(func) & METH_O)) {\n            return __Pyx_PyObject_CallMethO(func, arg);\n#if CYTHON_FAST_PYCCALL\n        } else if (__Pyx_PyFastCFunction_Check(func)) {\n            return __Pyx_PyCFunction_FastCall(func, &arg, 1);\n#endif\n        }\n    }\n    return __Pyx__PyObject_CallOneArg(func, arg);\n}\n#else\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_CallOneArg(PyObject *func, PyObject *arg) {\n    PyObject *result;\n    PyObject *args = PyTuple_Pack(1, arg);\n    if (unlikely(!args)) return NULL;\n    result = __Pyx_PyObject_Call(func, args, NULL);\n    Py_DECREF(args);\n    return result;\n}\n#endif\n\n/* BytesEquals */\nstatic CYTHON_INLINE int __Pyx_PyBytes_Equals(PyObject* s1, PyObject* s2, int equals) {\n#if CYTHON_COMPILING_IN_PYPY\n    return PyObject_RichCompareBool(s1, s2, equals);\n#else\n    if (s1 == s2) {\n        return (equals == Py_EQ);\n    } else if (PyBytes_CheckExact(s1) & PyBytes_CheckExact(s2)) {\n        const char 
*ps1, *ps2;\n        Py_ssize_t length = PyBytes_GET_SIZE(s1);\n        if (length != PyBytes_GET_SIZE(s2))\n            return (equals == Py_NE);\n        ps1 = PyBytes_AS_STRING(s1);\n        ps2 = PyBytes_AS_STRING(s2);\n        if (ps1[0] != ps2[0]) {\n            return (equals == Py_NE);\n        } else if (length == 1) {\n            return (equals == Py_EQ);\n        } else {\n            int result;\n#if CYTHON_USE_UNICODE_INTERNALS\n            Py_hash_t hash1, hash2;\n            hash1 = ((PyBytesObject*)s1)->ob_shash;\n            hash2 = ((PyBytesObject*)s2)->ob_shash;\n            if (hash1 != hash2 && hash1 != -1 && hash2 != -1) {\n                return (equals == Py_NE);\n            }\n#endif\n            result = memcmp(ps1, ps2, (size_t)length);\n            return (equals == Py_EQ) ? (result == 0) : (result != 0);\n        }\n    } else if ((s1 == Py_None) & PyBytes_CheckExact(s2)) {\n        return (equals == Py_NE);\n    } else if ((s2 == Py_None) & PyBytes_CheckExact(s1)) {\n        return (equals == Py_NE);\n    } else {\n        int result;\n        PyObject* py_result = PyObject_RichCompare(s1, s2, equals);\n        if (!py_result)\n            return -1;\n        result = __Pyx_PyObject_IsTrue(py_result);\n        Py_DECREF(py_result);\n        return result;\n    }\n#endif\n}\n\n/* UnicodeEquals */\nstatic CYTHON_INLINE int __Pyx_PyUnicode_Equals(PyObject* s1, PyObject* s2, int equals) {\n#if CYTHON_COMPILING_IN_PYPY\n    return PyObject_RichCompareBool(s1, s2, equals);\n#else\n#if PY_MAJOR_VERSION < 3\n    PyObject* owned_ref = NULL;\n#endif\n    int s1_is_unicode, s2_is_unicode;\n    if (s1 == s2) {\n        goto return_eq;\n    }\n    s1_is_unicode = PyUnicode_CheckExact(s1);\n    s2_is_unicode = PyUnicode_CheckExact(s2);\n#if PY_MAJOR_VERSION < 3\n    if ((s1_is_unicode & (!s2_is_unicode)) && PyString_CheckExact(s2)) {\n        owned_ref = PyUnicode_FromObject(s2);\n        if (unlikely(!owned_ref))\n            return -1;\n        
s2 = owned_ref;\n        s2_is_unicode = 1;\n    } else if ((s2_is_unicode & (!s1_is_unicode)) && PyString_CheckExact(s1)) {\n        owned_ref = PyUnicode_FromObject(s1);\n        if (unlikely(!owned_ref))\n            return -1;\n        s1 = owned_ref;\n        s1_is_unicode = 1;\n    } else if (((!s2_is_unicode) & (!s1_is_unicode))) {\n        return __Pyx_PyBytes_Equals(s1, s2, equals);\n    }\n#endif\n    if (s1_is_unicode & s2_is_unicode) {\n        Py_ssize_t length;\n        int kind;\n        void *data1, *data2;\n        if (unlikely(__Pyx_PyUnicode_READY(s1) < 0) || unlikely(__Pyx_PyUnicode_READY(s2) < 0))\n            return -1;\n        length = __Pyx_PyUnicode_GET_LENGTH(s1);\n        if (length != __Pyx_PyUnicode_GET_LENGTH(s2)) {\n            goto return_ne;\n        }\n#if CYTHON_USE_UNICODE_INTERNALS\n        {\n            Py_hash_t hash1, hash2;\n        #if CYTHON_PEP393_ENABLED\n            hash1 = ((PyASCIIObject*)s1)->hash;\n            hash2 = ((PyASCIIObject*)s2)->hash;\n        #else\n            hash1 = ((PyUnicodeObject*)s1)->hash;\n            hash2 = ((PyUnicodeObject*)s2)->hash;\n        #endif\n            if (hash1 != hash2 && hash1 != -1 && hash2 != -1) {\n                goto return_ne;\n            }\n        }\n#endif\n        kind = __Pyx_PyUnicode_KIND(s1);\n        if (kind != __Pyx_PyUnicode_KIND(s2)) {\n            goto return_ne;\n        }\n        data1 = __Pyx_PyUnicode_DATA(s1);\n        data2 = __Pyx_PyUnicode_DATA(s2);\n        if (__Pyx_PyUnicode_READ(kind, data1, 0) != __Pyx_PyUnicode_READ(kind, data2, 0)) {\n            goto return_ne;\n        } else if (length == 1) {\n            goto return_eq;\n        } else {\n            int result = memcmp(data1, data2, (size_t)(length * kind));\n            #if PY_MAJOR_VERSION < 3\n            Py_XDECREF(owned_ref);\n            #endif\n            return (equals == Py_EQ) ? 
(result == 0) : (result != 0);\n        }\n    } else if ((s1 == Py_None) & s2_is_unicode) {\n        goto return_ne;\n    } else if ((s2 == Py_None) & s1_is_unicode) {\n        goto return_ne;\n    } else {\n        int result;\n        PyObject* py_result = PyObject_RichCompare(s1, s2, equals);\n        #if PY_MAJOR_VERSION < 3\n        Py_XDECREF(owned_ref);\n        #endif\n        if (!py_result)\n            return -1;\n        result = __Pyx_PyObject_IsTrue(py_result);\n        Py_DECREF(py_result);\n        return result;\n    }\nreturn_eq:\n    #if PY_MAJOR_VERSION < 3\n    Py_XDECREF(owned_ref);\n    #endif\n    return (equals == Py_EQ);\nreturn_ne:\n    #if PY_MAJOR_VERSION < 3\n    Py_XDECREF(owned_ref);\n    #endif\n    return (equals == Py_NE);\n#endif\n}\n\n/* DivInt[Py_ssize_t] */\nstatic CYTHON_INLINE Py_ssize_t __Pyx_div_Py_ssize_t(Py_ssize_t a, Py_ssize_t b) {\n    Py_ssize_t q = a / b;\n    Py_ssize_t r = a - q*b;\n    q -= ((r != 0) & ((r ^ b) < 0));\n    return q;\n}\n\n/* GetAttr */\nstatic CYTHON_INLINE PyObject *__Pyx_GetAttr(PyObject *o, PyObject *n) {\n#if CYTHON_USE_TYPE_SLOTS\n#if PY_MAJOR_VERSION >= 3\n    if (likely(PyUnicode_Check(n)))\n#else\n    if (likely(PyString_Check(n)))\n#endif\n        return __Pyx_PyObject_GetAttrStr(o, n);\n#endif\n    return PyObject_GetAttr(o, n);\n}\n\n/* GetItemInt */\nstatic PyObject *__Pyx_GetItemInt_Generic(PyObject *o, PyObject* j) {\n    PyObject *r;\n    if (!j) return NULL;\n    r = PyObject_GetItem(o, j);\n    Py_DECREF(j);\n    return r;\n}\nstatic CYTHON_INLINE PyObject *__Pyx_GetItemInt_List_Fast(PyObject *o, Py_ssize_t i,\n                                                              CYTHON_NCP_UNUSED int wraparound,\n                                                              CYTHON_NCP_UNUSED int boundscheck) {\n#if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n    Py_ssize_t wrapped_i = i;\n    if (wraparound & unlikely(i < 0)) {\n        wrapped_i += PyList_GET_SIZE(o);\n  
  }\n    if ((!boundscheck) || likely(__Pyx_is_valid_index(wrapped_i, PyList_GET_SIZE(o)))) {\n        PyObject *r = PyList_GET_ITEM(o, wrapped_i);\n        Py_INCREF(r);\n        return r;\n    }\n    return __Pyx_GetItemInt_Generic(o, PyInt_FromSsize_t(i));\n#else\n    return PySequence_GetItem(o, i);\n#endif\n}\nstatic CYTHON_INLINE PyObject *__Pyx_GetItemInt_Tuple_Fast(PyObject *o, Py_ssize_t i,\n                                                              CYTHON_NCP_UNUSED int wraparound,\n                                                              CYTHON_NCP_UNUSED int boundscheck) {\n#if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS\n    Py_ssize_t wrapped_i = i;\n    if (wraparound & unlikely(i < 0)) {\n        wrapped_i += PyTuple_GET_SIZE(o);\n    }\n    if ((!boundscheck) || likely(__Pyx_is_valid_index(wrapped_i, PyTuple_GET_SIZE(o)))) {\n        PyObject *r = PyTuple_GET_ITEM(o, wrapped_i);\n        Py_INCREF(r);\n        return r;\n    }\n    return __Pyx_GetItemInt_Generic(o, PyInt_FromSsize_t(i));\n#else\n    return PySequence_GetItem(o, i);\n#endif\n}\nstatic CYTHON_INLINE PyObject *__Pyx_GetItemInt_Fast(PyObject *o, Py_ssize_t i, int is_list,\n                                                     CYTHON_NCP_UNUSED int wraparound,\n                                                     CYTHON_NCP_UNUSED int boundscheck) {\n#if CYTHON_ASSUME_SAFE_MACROS && !CYTHON_AVOID_BORROWED_REFS && CYTHON_USE_TYPE_SLOTS\n    if (is_list || PyList_CheckExact(o)) {\n        Py_ssize_t n = ((!wraparound) | likely(i >= 0)) ? i : i + PyList_GET_SIZE(o);\n        if ((!boundscheck) || (likely(__Pyx_is_valid_index(n, PyList_GET_SIZE(o))))) {\n            PyObject *r = PyList_GET_ITEM(o, n);\n            Py_INCREF(r);\n            return r;\n        }\n    }\n    else if (PyTuple_CheckExact(o)) {\n        Py_ssize_t n = ((!wraparound) | likely(i >= 0)) ? 
i : i + PyTuple_GET_SIZE(o);\n        if ((!boundscheck) || likely(__Pyx_is_valid_index(n, PyTuple_GET_SIZE(o)))) {\n            PyObject *r = PyTuple_GET_ITEM(o, n);\n            Py_INCREF(r);\n            return r;\n        }\n    } else {\n        PySequenceMethods *m = Py_TYPE(o)->tp_as_sequence;\n        if (likely(m && m->sq_item)) {\n            if (wraparound && unlikely(i < 0) && likely(m->sq_length)) {\n                Py_ssize_t l = m->sq_length(o);\n                if (likely(l >= 0)) {\n                    i += l;\n                } else {\n                    if (!PyErr_ExceptionMatches(PyExc_OverflowError))\n                        return NULL;\n                    PyErr_Clear();\n                }\n            }\n            return m->sq_item(o, i);\n        }\n    }\n#else\n    if (is_list || PySequence_Check(o)) {\n        return PySequence_GetItem(o, i);\n    }\n#endif\n    return __Pyx_GetItemInt_Generic(o, PyInt_FromSsize_t(i));\n}\n\n/* ObjectGetItem */\n#if CYTHON_USE_TYPE_SLOTS\nstatic PyObject *__Pyx_PyObject_GetIndex(PyObject *obj, PyObject* index) {\n    PyObject *runerr;\n    Py_ssize_t key_value;\n    PySequenceMethods *m = Py_TYPE(obj)->tp_as_sequence;\n    if (unlikely(!(m && m->sq_item))) {\n        PyErr_Format(PyExc_TypeError, \"'%.200s' object is not subscriptable\", Py_TYPE(obj)->tp_name);\n        return NULL;\n    }\n    key_value = __Pyx_PyIndex_AsSsize_t(index);\n    if (likely(key_value != -1 || !(runerr = PyErr_Occurred()))) {\n        return __Pyx_GetItemInt_Fast(obj, key_value, 0, 1, 1);\n    }\n    if (PyErr_GivenExceptionMatches(runerr, PyExc_OverflowError)) {\n        PyErr_Clear();\n        PyErr_Format(PyExc_IndexError, \"cannot fit '%.200s' into an index-sized integer\", Py_TYPE(index)->tp_name);\n    }\n    return NULL;\n}\nstatic PyObject *__Pyx_PyObject_GetItem(PyObject *obj, PyObject* key) {\n    PyMappingMethods *m = Py_TYPE(obj)->tp_as_mapping;\n    if (likely(m && m->mp_subscript)) {\n        return 
m->mp_subscript(obj, key);\n    }\n    return __Pyx_PyObject_GetIndex(obj, key);\n}\n#endif\n\n/* decode_c_string */\nstatic CYTHON_INLINE PyObject* __Pyx_decode_c_string(\n         const char* cstring, Py_ssize_t start, Py_ssize_t stop,\n         const char* encoding, const char* errors,\n         PyObject* (*decode_func)(const char *s, Py_ssize_t size, const char *errors)) {\n    Py_ssize_t length;\n    if (unlikely((start < 0) | (stop < 0))) {\n        size_t slen = strlen(cstring);\n        if (unlikely(slen > (size_t) PY_SSIZE_T_MAX)) {\n            PyErr_SetString(PyExc_OverflowError,\n                            \"c-string too long to convert to Python\");\n            return NULL;\n        }\n        length = (Py_ssize_t) slen;\n        if (start < 0) {\n            start += length;\n            if (start < 0)\n                start = 0;\n        }\n        if (stop < 0)\n            stop += length;\n    }\n    if (unlikely(stop <= start))\n        return __Pyx_NewRef(__pyx_empty_unicode);\n    length = stop - start;\n    cstring += start;\n    if (decode_func) {\n        return decode_func(cstring, length, errors);\n    } else {\n        return PyUnicode_Decode(cstring, length, encoding, errors);\n    }\n}\n\n/* GetAttr3 */\nstatic PyObject *__Pyx_GetAttr3Default(PyObject *d) {\n    __Pyx_PyThreadState_declare\n    __Pyx_PyThreadState_assign\n    if (unlikely(!__Pyx_PyErr_ExceptionMatches(PyExc_AttributeError)))\n        return NULL;\n    __Pyx_PyErr_Clear();\n    Py_INCREF(d);\n    return d;\n}\nstatic CYTHON_INLINE PyObject *__Pyx_GetAttr3(PyObject *o, PyObject *n, PyObject *d) {\n    PyObject *r = __Pyx_GetAttr(o, n);\n    return (likely(r)) ? r : __Pyx_GetAttr3Default(d);\n}\n\n/* PyDictVersioning */\n#if CYTHON_USE_DICT_VERSIONS && CYTHON_USE_TYPE_SLOTS\nstatic CYTHON_INLINE PY_UINT64_T __Pyx_get_tp_dict_version(PyObject *obj) {\n    PyObject *dict = Py_TYPE(obj)->tp_dict;\n    return likely(dict) ? 
__PYX_GET_DICT_VERSION(dict) : 0;\n}\nstatic CYTHON_INLINE PY_UINT64_T __Pyx_get_object_dict_version(PyObject *obj) {\n    PyObject **dictptr = NULL;\n    Py_ssize_t offset = Py_TYPE(obj)->tp_dictoffset;\n    if (offset) {\n#if CYTHON_COMPILING_IN_CPYTHON\n        dictptr = (likely(offset > 0)) ? (PyObject **) ((char *)obj + offset) : _PyObject_GetDictPtr(obj);\n#else\n        dictptr = _PyObject_GetDictPtr(obj);\n#endif\n    }\n    return (dictptr && *dictptr) ? __PYX_GET_DICT_VERSION(*dictptr) : 0;\n}\nstatic CYTHON_INLINE int __Pyx_object_dict_version_matches(PyObject* obj, PY_UINT64_T tp_dict_version, PY_UINT64_T obj_dict_version) {\n    PyObject *dict = Py_TYPE(obj)->tp_dict;\n    if (unlikely(!dict) || unlikely(tp_dict_version != __PYX_GET_DICT_VERSION(dict)))\n        return 0;\n    return obj_dict_version == __Pyx_get_object_dict_version(obj);\n}\n#endif\n\n/* GetModuleGlobalName */\n#if CYTHON_USE_DICT_VERSIONS\nstatic PyObject *__Pyx__GetModuleGlobalName(PyObject *name, PY_UINT64_T *dict_version, PyObject **dict_cached_value)\n#else\nstatic CYTHON_INLINE PyObject *__Pyx__GetModuleGlobalName(PyObject *name)\n#endif\n{\n    PyObject *result;\n#if !CYTHON_AVOID_BORROWED_REFS\n#if CYTHON_COMPILING_IN_CPYTHON && PY_VERSION_HEX >= 0x030500A1\n    result = _PyDict_GetItem_KnownHash(__pyx_d, name, ((PyASCIIObject *) name)->hash);\n    __PYX_UPDATE_DICT_CACHE(__pyx_d, result, *dict_cached_value, *dict_version)\n    if (likely(result)) {\n        return __Pyx_NewRef(result);\n    } else if (unlikely(PyErr_Occurred())) {\n        return NULL;\n    }\n#else\n    result = PyDict_GetItem(__pyx_d, name);\n    __PYX_UPDATE_DICT_CACHE(__pyx_d, result, *dict_cached_value, *dict_version)\n    if (likely(result)) {\n        return __Pyx_NewRef(result);\n    }\n#endif\n#else\n    result = PyObject_GetItem(__pyx_d, name);\n    __PYX_UPDATE_DICT_CACHE(__pyx_d, result, *dict_cached_value, *dict_version)\n    if (likely(result)) {\n        return __Pyx_NewRef(result);\n    }\n    
PyErr_Clear();\n#endif\n    return __Pyx_GetBuiltinName(name);\n}\n\n/* RaiseTooManyValuesToUnpack */\nstatic CYTHON_INLINE void __Pyx_RaiseTooManyValuesError(Py_ssize_t expected) {\n    PyErr_Format(PyExc_ValueError,\n                 \"too many values to unpack (expected %\" CYTHON_FORMAT_SSIZE_T \"d)\", expected);\n}\n\n/* RaiseNeedMoreValuesToUnpack */\nstatic CYTHON_INLINE void __Pyx_RaiseNeedMoreValuesError(Py_ssize_t index) {\n    PyErr_Format(PyExc_ValueError,\n                 \"need more than %\" CYTHON_FORMAT_SSIZE_T \"d value%.1s to unpack\",\n                 index, (index == 1) ? \"\" : \"s\");\n}\n\n/* RaiseNoneIterError */\nstatic CYTHON_INLINE void __Pyx_RaiseNoneNotIterableError(void) {\n    PyErr_SetString(PyExc_TypeError, \"'NoneType' object is not iterable\");\n}\n\n/* ExtTypeTest */\nstatic CYTHON_INLINE int __Pyx_TypeTest(PyObject *obj, PyTypeObject *type) {\n    if (unlikely(!type)) {\n        PyErr_SetString(PyExc_SystemError, \"Missing type object\");\n        return 0;\n    }\n    if (likely(__Pyx_TypeCheck(obj, type)))\n        return 1;\n    PyErr_Format(PyExc_TypeError, \"Cannot convert %.200s to %.200s\",\n                 Py_TYPE(obj)->tp_name, type->tp_name);\n    return 0;\n}\n\n/* SwapException */\n#if CYTHON_FAST_THREAD_STATE\nstatic CYTHON_INLINE void __Pyx__ExceptionSwap(PyThreadState *tstate, PyObject **type, PyObject **value, PyObject **tb) {\n    PyObject *tmp_type, *tmp_value, *tmp_tb;\n    #if CYTHON_USE_EXC_INFO_STACK\n    _PyErr_StackItem *exc_info = tstate->exc_info;\n    tmp_type = exc_info->exc_type;\n    tmp_value = exc_info->exc_value;\n    tmp_tb = exc_info->exc_traceback;\n    exc_info->exc_type = *type;\n    exc_info->exc_value = *value;\n    exc_info->exc_traceback = *tb;\n    #else\n    tmp_type = tstate->exc_type;\n    tmp_value = tstate->exc_value;\n    tmp_tb = tstate->exc_traceback;\n    tstate->exc_type = *type;\n    tstate->exc_value = *value;\n    tstate->exc_traceback = *tb;\n    #endif\n    *type = 
tmp_type;\n    *value = tmp_value;\n    *tb = tmp_tb;\n}\n#else\nstatic CYTHON_INLINE void __Pyx_ExceptionSwap(PyObject **type, PyObject **value, PyObject **tb) {\n    PyObject *tmp_type, *tmp_value, *tmp_tb;\n    PyErr_GetExcInfo(&tmp_type, &tmp_value, &tmp_tb);\n    PyErr_SetExcInfo(*type, *value, *tb);\n    *type = tmp_type;\n    *value = tmp_value;\n    *tb = tmp_tb;\n}\n#endif\n\n/* Import */\nstatic PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int level) {\n    PyObject *empty_list = 0;\n    PyObject *module = 0;\n    PyObject *global_dict = 0;\n    PyObject *empty_dict = 0;\n    PyObject *list;\n    #if PY_MAJOR_VERSION < 3\n    PyObject *py_import;\n    py_import = __Pyx_PyObject_GetAttrStr(__pyx_b, __pyx_n_s_import);\n    if (!py_import)\n        goto bad;\n    #endif\n    if (from_list)\n        list = from_list;\n    else {\n        empty_list = PyList_New(0);\n        if (!empty_list)\n            goto bad;\n        list = empty_list;\n    }\n    global_dict = PyModule_GetDict(__pyx_m);\n    if (!global_dict)\n        goto bad;\n    empty_dict = PyDict_New();\n    if (!empty_dict)\n        goto bad;\n    {\n        #if PY_MAJOR_VERSION >= 3\n        if (level == -1) {\n            if ((1) && (strchr(__Pyx_MODULE_NAME, '.'))) {\n                module = PyImport_ImportModuleLevelObject(\n                    name, global_dict, empty_dict, list, 1);\n                if (!module) {\n                    if (!PyErr_ExceptionMatches(PyExc_ImportError))\n                        goto bad;\n                    PyErr_Clear();\n                }\n            }\n            level = 0;\n        }\n        #endif\n        if (!module) {\n            #if PY_MAJOR_VERSION < 3\n            PyObject *py_level = PyInt_FromLong(level);\n            if (!py_level)\n                goto bad;\n            module = PyObject_CallFunctionObjArgs(py_import,\n                name, global_dict, empty_dict, list, py_level, (PyObject *)NULL);\n            
Py_DECREF(py_level);\n            #else\n            module = PyImport_ImportModuleLevelObject(\n                name, global_dict, empty_dict, list, level);\n            #endif\n        }\n    }\nbad:\n    #if PY_MAJOR_VERSION < 3\n    Py_XDECREF(py_import);\n    #endif\n    Py_XDECREF(empty_list);\n    Py_XDECREF(empty_dict);\n    return module;\n}\n\n/* FastTypeChecks */\n#if CYTHON_COMPILING_IN_CPYTHON\nstatic int __Pyx_InBases(PyTypeObject *a, PyTypeObject *b) {\n    while (a) {\n        a = a->tp_base;\n        if (a == b)\n            return 1;\n    }\n    return b == &PyBaseObject_Type;\n}\nstatic CYTHON_INLINE int __Pyx_IsSubtype(PyTypeObject *a, PyTypeObject *b) {\n    PyObject *mro;\n    if (a == b) return 1;\n    mro = a->tp_mro;\n    if (likely(mro)) {\n        Py_ssize_t i, n;\n        n = PyTuple_GET_SIZE(mro);\n        for (i = 0; i < n; i++) {\n            if (PyTuple_GET_ITEM(mro, i) == (PyObject *)b)\n                return 1;\n        }\n        return 0;\n    }\n    return __Pyx_InBases(a, b);\n}\n#if PY_MAJOR_VERSION == 2\nstatic int __Pyx_inner_PyErr_GivenExceptionMatches2(PyObject *err, PyObject* exc_type1, PyObject* exc_type2) {\n    PyObject *exception, *value, *tb;\n    int res;\n    __Pyx_PyThreadState_declare\n    __Pyx_PyThreadState_assign\n    __Pyx_ErrFetch(&exception, &value, &tb);\n    res = exc_type1 ? PyObject_IsSubclass(err, exc_type1) : 0;\n    if (unlikely(res == -1)) {\n        PyErr_WriteUnraisable(err);\n        res = 0;\n    }\n    if (!res) {\n        res = PyObject_IsSubclass(err, exc_type2);\n        if (unlikely(res == -1)) {\n            PyErr_WriteUnraisable(err);\n            res = 0;\n        }\n    }\n    __Pyx_ErrRestore(exception, value, tb);\n    return res;\n}\n#else\nstatic CYTHON_INLINE int __Pyx_inner_PyErr_GivenExceptionMatches2(PyObject *err, PyObject* exc_type1, PyObject *exc_type2) {\n    int res = exc_type1 ? 
__Pyx_IsSubtype((PyTypeObject*)err, (PyTypeObject*)exc_type1) : 0;\n    if (!res) {\n        res = __Pyx_IsSubtype((PyTypeObject*)err, (PyTypeObject*)exc_type2);\n    }\n    return res;\n}\n#endif\nstatic int __Pyx_PyErr_GivenExceptionMatchesTuple(PyObject *exc_type, PyObject *tuple) {\n    Py_ssize_t i, n;\n    assert(PyExceptionClass_Check(exc_type));\n    n = PyTuple_GET_SIZE(tuple);\n#if PY_MAJOR_VERSION >= 3\n    for (i=0; i<n; i++) {\n        if (exc_type == PyTuple_GET_ITEM(tuple, i)) return 1;\n    }\n#endif\n    for (i=0; i<n; i++) {\n        PyObject *t = PyTuple_GET_ITEM(tuple, i);\n        #if PY_MAJOR_VERSION < 3\n        if (likely(exc_type == t)) return 1;\n        #endif\n        if (likely(PyExceptionClass_Check(t))) {\n            if (__Pyx_inner_PyErr_GivenExceptionMatches2(exc_type, NULL, t)) return 1;\n        } else {\n        }\n    }\n    return 0;\n}\nstatic CYTHON_INLINE int __Pyx_PyErr_GivenExceptionMatches(PyObject *err, PyObject* exc_type) {\n    if (likely(err == exc_type)) return 1;\n    if (likely(PyExceptionClass_Check(err))) {\n        if (likely(PyExceptionClass_Check(exc_type))) {\n            return __Pyx_inner_PyErr_GivenExceptionMatches2(err, NULL, exc_type);\n        } else if (likely(PyTuple_Check(exc_type))) {\n            return __Pyx_PyErr_GivenExceptionMatchesTuple(err, exc_type);\n        } else {\n        }\n    }\n    return PyErr_GivenExceptionMatches(err, exc_type);\n}\nstatic CYTHON_INLINE int __Pyx_PyErr_GivenExceptionMatches2(PyObject *err, PyObject *exc_type1, PyObject *exc_type2) {\n    assert(PyExceptionClass_Check(exc_type1));\n    assert(PyExceptionClass_Check(exc_type2));\n    if (likely(err == exc_type1 || err == exc_type2)) return 1;\n    if (likely(PyExceptionClass_Check(err))) {\n        return __Pyx_inner_PyErr_GivenExceptionMatches2(err, exc_type1, exc_type2);\n    }\n    return (PyErr_GivenExceptionMatches(err, exc_type1) || PyErr_GivenExceptionMatches(err, exc_type2));\n}\n#endif\n\n/* PyIntBinop 
*/\n#if !CYTHON_COMPILING_IN_PYPY\nstatic PyObject* __Pyx_PyInt_AddObjC(PyObject *op1, PyObject *op2, CYTHON_UNUSED long intval, int inplace, int zerodivision_check) {\n    (void)inplace;\n    (void)zerodivision_check;\n    #if PY_MAJOR_VERSION < 3\n    if (likely(PyInt_CheckExact(op1))) {\n        const long b = intval;\n        long x;\n        long a = PyInt_AS_LONG(op1);\n            x = (long)((unsigned long)a + b);\n            if (likely((x^a) >= 0 || (x^b) >= 0))\n                return PyInt_FromLong(x);\n            return PyLong_Type.tp_as_number->nb_add(op1, op2);\n    }\n    #endif\n    #if CYTHON_USE_PYLONG_INTERNALS\n    if (likely(PyLong_CheckExact(op1))) {\n        const long b = intval;\n        long a, x;\n#ifdef HAVE_LONG_LONG\n        const PY_LONG_LONG llb = intval;\n        PY_LONG_LONG lla, llx;\n#endif\n        const digit* digits = ((PyLongObject*)op1)->ob_digit;\n        const Py_ssize_t size = Py_SIZE(op1);\n        if (likely(__Pyx_sst_abs(size) <= 1)) {\n            a = likely(size) ? 
digits[0] : 0;\n            if (size == -1) a = -a;\n        } else {\n            switch (size) {\n                case -2:\n                    if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) {\n                        a = -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]));\n                        break;\n#ifdef HAVE_LONG_LONG\n                    } else if (8 * sizeof(PY_LONG_LONG) - 1 > 2 * PyLong_SHIFT) {\n                        lla = -(PY_LONG_LONG) (((((unsigned PY_LONG_LONG)digits[1]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[0]));\n                        goto long_long;\n#endif\n                    }\n                    CYTHON_FALLTHROUGH;\n                case 2:\n                    if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) {\n                        a = (long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]));\n                        break;\n#ifdef HAVE_LONG_LONG\n                    } else if (8 * sizeof(PY_LONG_LONG) - 1 > 2 * PyLong_SHIFT) {\n                        lla = (PY_LONG_LONG) (((((unsigned PY_LONG_LONG)digits[1]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[0]));\n                        goto long_long;\n#endif\n                    }\n                    CYTHON_FALLTHROUGH;\n                case -3:\n                    if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) {\n                        a = -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]));\n                        break;\n#ifdef HAVE_LONG_LONG\n                    } else if (8 * sizeof(PY_LONG_LONG) - 1 > 3 * PyLong_SHIFT) {\n                        lla = -(PY_LONG_LONG) (((((((unsigned PY_LONG_LONG)digits[2]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[1]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[0]));\n                        goto long_long;\n#endif\n                    }\n                    CYTHON_FALLTHROUGH;\n    
            case 3:\n                    if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) {\n                        a = (long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]));\n                        break;\n#ifdef HAVE_LONG_LONG\n                    } else if (8 * sizeof(PY_LONG_LONG) - 1 > 3 * PyLong_SHIFT) {\n                        lla = (PY_LONG_LONG) (((((((unsigned PY_LONG_LONG)digits[2]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[1]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[0]));\n                        goto long_long;\n#endif\n                    }\n                    CYTHON_FALLTHROUGH;\n                case -4:\n                    if (8 * sizeof(long) - 1 > 4 * PyLong_SHIFT) {\n                        a = -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]));\n                        break;\n#ifdef HAVE_LONG_LONG\n                    } else if (8 * sizeof(PY_LONG_LONG) - 1 > 4 * PyLong_SHIFT) {\n                        lla = -(PY_LONG_LONG) (((((((((unsigned PY_LONG_LONG)digits[3]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[2]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[1]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[0]));\n                        goto long_long;\n#endif\n                    }\n                    CYTHON_FALLTHROUGH;\n                case 4:\n                    if (8 * sizeof(long) - 1 > 4 * PyLong_SHIFT) {\n                        a = (long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]));\n                        break;\n#ifdef HAVE_LONG_LONG\n                    } else if (8 * sizeof(PY_LONG_LONG) - 1 > 4 * PyLong_SHIFT) {\n                        lla = (PY_LONG_LONG) (((((((((unsigned 
PY_LONG_LONG)digits[3]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[2]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[1]) << PyLong_SHIFT) | (unsigned PY_LONG_LONG)digits[0]));\n                        goto long_long;\n#endif\n                    }\n                    CYTHON_FALLTHROUGH;\n                default: return PyLong_Type.tp_as_number->nb_add(op1, op2);\n            }\n        }\n                x = a + b;\n            return PyLong_FromLong(x);\n#ifdef HAVE_LONG_LONG\n        long_long:\n                llx = lla + llb;\n            return PyLong_FromLongLong(llx);\n#endif\n        \n        \n    }\n    #endif\n    if (PyFloat_CheckExact(op1)) {\n        const long b = intval;\n        double a = PyFloat_AS_DOUBLE(op1);\n            double result;\n            PyFPE_START_PROTECT(\"add\", return NULL)\n            result = ((double)a) + (double)b;\n            PyFPE_END_PROTECT(result)\n            return PyFloat_FromDouble(result);\n    }\n    return (inplace ? PyNumber_InPlaceAdd : PyNumber_Add)(op1, op2);\n}\n#endif\n\n/* DivInt[long] */\nstatic CYTHON_INLINE long __Pyx_div_long(long a, long b) {\n    long q = a / b;\n    long r = a - q*b;\n    q -= ((r != 0) & ((r ^ b) < 0));\n    return q;\n}\n\n/* ImportFrom */\nstatic PyObject* __Pyx_ImportFrom(PyObject* module, PyObject* name) {\n    PyObject* value = __Pyx_PyObject_GetAttrStr(module, name);\n    if (unlikely(!value) && PyErr_ExceptionMatches(PyExc_AttributeError)) {\n        PyErr_Format(PyExc_ImportError,\n        #if PY_MAJOR_VERSION < 3\n            \"cannot import name %.230s\", PyString_AS_STRING(name));\n        #else\n            \"cannot import name %S\", name);\n        #endif\n    }\n    return value;\n}\n\n/* HasAttr */\nstatic CYTHON_INLINE int __Pyx_HasAttr(PyObject *o, PyObject *n) {\n    PyObject *r;\n    if (unlikely(!__Pyx_PyBaseString_Check(n))) {\n        PyErr_SetString(PyExc_TypeError,\n                        \"hasattr(): attribute name must be string\");\n        
return -1;\n    }\n    r = __Pyx_GetAttr(o, n);\n    if (unlikely(!r)) {\n        PyErr_Clear();\n        return 0;\n    } else {\n        Py_DECREF(r);\n        return 1;\n    }\n}\n\n/* PyObject_GenericGetAttrNoDict */\n#if CYTHON_USE_TYPE_SLOTS && CYTHON_USE_PYTYPE_LOOKUP && PY_VERSION_HEX < 0x03070000\nstatic PyObject *__Pyx_RaiseGenericGetAttributeError(PyTypeObject *tp, PyObject *attr_name) {\n    PyErr_Format(PyExc_AttributeError,\n#if PY_MAJOR_VERSION >= 3\n                 \"'%.50s' object has no attribute '%U'\",\n                 tp->tp_name, attr_name);\n#else\n                 \"'%.50s' object has no attribute '%.400s'\",\n                 tp->tp_name, PyString_AS_STRING(attr_name));\n#endif\n    return NULL;\n}\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_GenericGetAttrNoDict(PyObject* obj, PyObject* attr_name) {\n    PyObject *descr;\n    PyTypeObject *tp = Py_TYPE(obj);\n    if (unlikely(!PyString_Check(attr_name))) {\n        return PyObject_GenericGetAttr(obj, attr_name);\n    }\n    assert(!tp->tp_dictoffset);\n    descr = _PyType_Lookup(tp, attr_name);\n    if (unlikely(!descr)) {\n        return __Pyx_RaiseGenericGetAttributeError(tp, attr_name);\n    }\n    Py_INCREF(descr);\n    #if PY_MAJOR_VERSION < 3\n    if (likely(PyType_HasFeature(Py_TYPE(descr), Py_TPFLAGS_HAVE_CLASS)))\n    #endif\n    {\n        descrgetfunc f = Py_TYPE(descr)->tp_descr_get;\n        if (unlikely(f)) {\n            PyObject *res = f(descr, obj, (PyObject *)tp);\n            Py_DECREF(descr);\n            return res;\n        }\n    }\n    return descr;\n}\n#endif\n\n/* PyObject_GenericGetAttr */\n#if CYTHON_USE_TYPE_SLOTS && CYTHON_USE_PYTYPE_LOOKUP && PY_VERSION_HEX < 0x03070000\nstatic PyObject* __Pyx_PyObject_GenericGetAttr(PyObject* obj, PyObject* attr_name) {\n    if (unlikely(Py_TYPE(obj)->tp_dictoffset)) {\n        return PyObject_GenericGetAttr(obj, attr_name);\n    }\n    return __Pyx_PyObject_GenericGetAttrNoDict(obj, attr_name);\n}\n#endif\n\n/* 
SetVTable */\nstatic int __Pyx_SetVtable(PyObject *dict, void *vtable) {\n#if PY_VERSION_HEX >= 0x02070000\n    PyObject *ob = PyCapsule_New(vtable, 0, 0);\n#else\n    PyObject *ob = PyCObject_FromVoidPtr(vtable, 0);\n#endif\n    if (!ob)\n        goto bad;\n    if (PyDict_SetItem(dict, __pyx_n_s_pyx_vtable, ob) < 0)\n        goto bad;\n    Py_DECREF(ob);\n    return 0;\nbad:\n    Py_XDECREF(ob);\n    return -1;\n}\n\n/* PyObjectGetAttrStrNoError */\nstatic void __Pyx_PyObject_GetAttrStr_ClearAttributeError(void) {\n    __Pyx_PyThreadState_declare\n    __Pyx_PyThreadState_assign\n    if (likely(__Pyx_PyErr_ExceptionMatches(PyExc_AttributeError)))\n        __Pyx_PyErr_Clear();\n}\nstatic CYTHON_INLINE PyObject* __Pyx_PyObject_GetAttrStrNoError(PyObject* obj, PyObject* attr_name) {\n    PyObject *result;\n#if CYTHON_COMPILING_IN_CPYTHON && CYTHON_USE_TYPE_SLOTS && PY_VERSION_HEX >= 0x030700B1\n    PyTypeObject* tp = Py_TYPE(obj);\n    if (likely(tp->tp_getattro == PyObject_GenericGetAttr)) {\n        return _PyObject_GenericGetAttrWithDict(obj, attr_name, NULL, 1);\n    }\n#endif\n    result = __Pyx_PyObject_GetAttrStr(obj, attr_name);\n    if (unlikely(!result)) {\n        __Pyx_PyObject_GetAttrStr_ClearAttributeError();\n    }\n    return result;\n}\n\n/* SetupReduce */\nstatic int __Pyx_setup_reduce_is_named(PyObject* meth, PyObject* name) {\n  int ret;\n  PyObject *name_attr;\n  name_attr = __Pyx_PyObject_GetAttrStr(meth, __pyx_n_s_name_2);\n  if (likely(name_attr)) {\n      ret = PyObject_RichCompareBool(name_attr, name, Py_EQ);\n  } else {\n      ret = -1;\n  }\n  if (unlikely(ret < 0)) {\n      PyErr_Clear();\n      ret = 0;\n  }\n  Py_XDECREF(name_attr);\n  return ret;\n}\nstatic int __Pyx_setup_reduce(PyObject* type_obj) {\n    int ret = 0;\n    PyObject *object_reduce = NULL;\n    PyObject *object_reduce_ex = NULL;\n    PyObject *reduce = NULL;\n    PyObject *reduce_ex = NULL;\n    PyObject *reduce_cython = NULL;\n    PyObject *setstate = NULL;\n    
PyObject *setstate_cython = NULL;\n#if CYTHON_USE_PYTYPE_LOOKUP\n    if (_PyType_Lookup((PyTypeObject*)type_obj, __pyx_n_s_getstate)) goto __PYX_GOOD;\n#else\n    if (PyObject_HasAttr(type_obj, __pyx_n_s_getstate)) goto __PYX_GOOD;\n#endif\n#if CYTHON_USE_PYTYPE_LOOKUP\n    object_reduce_ex = _PyType_Lookup(&PyBaseObject_Type, __pyx_n_s_reduce_ex); if (!object_reduce_ex) goto __PYX_BAD;\n#else\n    object_reduce_ex = __Pyx_PyObject_GetAttrStr((PyObject*)&PyBaseObject_Type, __pyx_n_s_reduce_ex); if (!object_reduce_ex) goto __PYX_BAD;\n#endif\n    reduce_ex = __Pyx_PyObject_GetAttrStr(type_obj, __pyx_n_s_reduce_ex); if (unlikely(!reduce_ex)) goto __PYX_BAD;\n    if (reduce_ex == object_reduce_ex) {\n#if CYTHON_USE_PYTYPE_LOOKUP\n        object_reduce = _PyType_Lookup(&PyBaseObject_Type, __pyx_n_s_reduce); if (!object_reduce) goto __PYX_BAD;\n#else\n        object_reduce = __Pyx_PyObject_GetAttrStr((PyObject*)&PyBaseObject_Type, __pyx_n_s_reduce); if (!object_reduce) goto __PYX_BAD;\n#endif\n        reduce = __Pyx_PyObject_GetAttrStr(type_obj, __pyx_n_s_reduce); if (unlikely(!reduce)) goto __PYX_BAD;\n        if (reduce == object_reduce || __Pyx_setup_reduce_is_named(reduce, __pyx_n_s_reduce_cython)) {\n            reduce_cython = __Pyx_PyObject_GetAttrStrNoError(type_obj, __pyx_n_s_reduce_cython);\n            if (likely(reduce_cython)) {\n                ret = PyDict_SetItem(((PyTypeObject*)type_obj)->tp_dict, __pyx_n_s_reduce, reduce_cython); if (unlikely(ret < 0)) goto __PYX_BAD;\n                ret = PyDict_DelItem(((PyTypeObject*)type_obj)->tp_dict, __pyx_n_s_reduce_cython); if (unlikely(ret < 0)) goto __PYX_BAD;\n            } else if (reduce == object_reduce || PyErr_Occurred()) {\n                goto __PYX_BAD;\n            }\n            setstate = __Pyx_PyObject_GetAttrStr(type_obj, __pyx_n_s_setstate);\n            if (!setstate) PyErr_Clear();\n            if (!setstate || __Pyx_setup_reduce_is_named(setstate, __pyx_n_s_setstate_cython)) {\n             
   setstate_cython = __Pyx_PyObject_GetAttrStrNoError(type_obj, __pyx_n_s_setstate_cython);\n                if (likely(setstate_cython)) {\n                    ret = PyDict_SetItem(((PyTypeObject*)type_obj)->tp_dict, __pyx_n_s_setstate, setstate_cython); if (unlikely(ret < 0)) goto __PYX_BAD;\n                    ret = PyDict_DelItem(((PyTypeObject*)type_obj)->tp_dict, __pyx_n_s_setstate_cython); if (unlikely(ret < 0)) goto __PYX_BAD;\n                } else if (!setstate || PyErr_Occurred()) {\n                    goto __PYX_BAD;\n                }\n            }\n            PyType_Modified((PyTypeObject*)type_obj);\n        }\n    }\n    goto __PYX_GOOD;\n__PYX_BAD:\n    if (!PyErr_Occurred())\n        PyErr_Format(PyExc_RuntimeError, \"Unable to initialize pickling for %s\", ((PyTypeObject*)type_obj)->tp_name);\n    ret = -1;\n__PYX_GOOD:\n#if !CYTHON_USE_PYTYPE_LOOKUP\n    Py_XDECREF(object_reduce);\n    Py_XDECREF(object_reduce_ex);\n#endif\n    Py_XDECREF(reduce);\n    Py_XDECREF(reduce_ex);\n    Py_XDECREF(reduce_cython);\n    Py_XDECREF(setstate);\n    Py_XDECREF(setstate_cython);\n    return ret;\n}\n\n/* TypeImport */\n#ifndef __PYX_HAVE_RT_ImportType\n#define __PYX_HAVE_RT_ImportType\nstatic PyTypeObject *__Pyx_ImportType(PyObject *module, const char *module_name, const char *class_name,\n    size_t size, enum __Pyx_ImportType_CheckSize check_size)\n{\n    PyObject *result = 0;\n    char warning[200];\n    Py_ssize_t basicsize;\n#ifdef Py_LIMITED_API\n    PyObject *py_basicsize;\n#endif\n    result = PyObject_GetAttrString(module, class_name);\n    if (!result)\n        goto bad;\n    if (!PyType_Check(result)) {\n        PyErr_Format(PyExc_TypeError,\n            \"%.200s.%.200s is not a type object\",\n            module_name, class_name);\n        goto bad;\n    }\n#ifndef Py_LIMITED_API\n    basicsize = ((PyTypeObject *)result)->tp_basicsize;\n#else\n    py_basicsize = PyObject_GetAttrString(result, \"__basicsize__\");\n    if (!py_basicsize)\n     
   goto bad;\n    basicsize = PyLong_AsSsize_t(py_basicsize);\n    Py_DECREF(py_basicsize);\n    py_basicsize = 0;\n    if (basicsize == (Py_ssize_t)-1 && PyErr_Occurred())\n        goto bad;\n#endif\n    if ((size_t)basicsize < size) {\n        PyErr_Format(PyExc_ValueError,\n            \"%.200s.%.200s size changed, may indicate binary incompatibility. \"\n            \"Expected %zd from C header, got %zd from PyObject\",\n            module_name, class_name, size, basicsize);\n        goto bad;\n    }\n    if (check_size == __Pyx_ImportType_CheckSize_Error && (size_t)basicsize != size) {\n        PyErr_Format(PyExc_ValueError,\n            \"%.200s.%.200s size changed, may indicate binary incompatibility. \"\n            \"Expected %zd from C header, got %zd from PyObject\",\n            module_name, class_name, size, basicsize);\n        goto bad;\n    }\n    else if (check_size == __Pyx_ImportType_CheckSize_Warn && (size_t)basicsize > size) {\n        PyOS_snprintf(warning, sizeof(warning),\n            \"%s.%s size changed, may indicate binary incompatibility. 
\"\n            \"Expected %zd from C header, got %zd from PyObject\",\n            module_name, class_name, size, basicsize);\n        if (PyErr_WarnEx(NULL, warning, 0) < 0) goto bad;\n    }\n    return (PyTypeObject *)result;\nbad:\n    Py_XDECREF(result);\n    return NULL;\n}\n#endif\n\n/* CLineInTraceback */\n#ifndef CYTHON_CLINE_IN_TRACEBACK\nstatic int __Pyx_CLineForTraceback(CYTHON_NCP_UNUSED PyThreadState *tstate, int c_line) {\n    PyObject *use_cline;\n    PyObject *ptype, *pvalue, *ptraceback;\n#if CYTHON_COMPILING_IN_CPYTHON\n    PyObject **cython_runtime_dict;\n#endif\n    if (unlikely(!__pyx_cython_runtime)) {\n        return c_line;\n    }\n    __Pyx_ErrFetchInState(tstate, &ptype, &pvalue, &ptraceback);\n#if CYTHON_COMPILING_IN_CPYTHON\n    cython_runtime_dict = _PyObject_GetDictPtr(__pyx_cython_runtime);\n    if (likely(cython_runtime_dict)) {\n        __PYX_PY_DICT_LOOKUP_IF_MODIFIED(\n            use_cline, *cython_runtime_dict,\n            __Pyx_PyDict_GetItemStr(*cython_runtime_dict, __pyx_n_s_cline_in_traceback))\n    } else\n#endif\n    {\n      PyObject *use_cline_obj = __Pyx_PyObject_GetAttrStr(__pyx_cython_runtime, __pyx_n_s_cline_in_traceback);\n      if (use_cline_obj) {\n        use_cline = PyObject_Not(use_cline_obj) ? 
Py_False : Py_True;\n        Py_DECREF(use_cline_obj);\n      } else {\n        PyErr_Clear();\n        use_cline = NULL;\n      }\n    }\n    if (!use_cline) {\n        c_line = 0;\n        (void) PyObject_SetAttr(__pyx_cython_runtime, __pyx_n_s_cline_in_traceback, Py_False);\n    }\n    else if (use_cline == Py_False || (use_cline != Py_True && PyObject_Not(use_cline) != 0)) {\n        c_line = 0;\n    }\n    __Pyx_ErrRestoreInState(tstate, ptype, pvalue, ptraceback);\n    return c_line;\n}\n#endif\n\n/* CodeObjectCache */\nstatic int __pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry* entries, int count, int code_line) {\n    int start = 0, mid = 0, end = count - 1;\n    if (end >= 0 && code_line > entries[end].code_line) {\n        return count;\n    }\n    while (start < end) {\n        mid = start + (end - start) / 2;\n        if (code_line < entries[mid].code_line) {\n            end = mid;\n        } else if (code_line > entries[mid].code_line) {\n             start = mid + 1;\n        } else {\n            return mid;\n        }\n    }\n    if (code_line <= entries[mid].code_line) {\n        return mid;\n    } else {\n        return mid + 1;\n    }\n}\nstatic PyCodeObject *__pyx_find_code_object(int code_line) {\n    PyCodeObject* code_object;\n    int pos;\n    if (unlikely(!code_line) || unlikely(!__pyx_code_cache.entries)) {\n        return NULL;\n    }\n    pos = __pyx_bisect_code_objects(__pyx_code_cache.entries, __pyx_code_cache.count, code_line);\n    if (unlikely(pos >= __pyx_code_cache.count) || unlikely(__pyx_code_cache.entries[pos].code_line != code_line)) {\n        return NULL;\n    }\n    code_object = __pyx_code_cache.entries[pos].code_object;\n    Py_INCREF(code_object);\n    return code_object;\n}\nstatic void __pyx_insert_code_object(int code_line, PyCodeObject* code_object) {\n    int pos, i;\n    __Pyx_CodeObjectCacheEntry* entries = __pyx_code_cache.entries;\n    if (unlikely(!code_line)) {\n        return;\n    }\n    if 
(unlikely(!entries)) {\n        entries = (__Pyx_CodeObjectCacheEntry*)PyMem_Malloc(64*sizeof(__Pyx_CodeObjectCacheEntry));\n        if (likely(entries)) {\n            __pyx_code_cache.entries = entries;\n            __pyx_code_cache.max_count = 64;\n            __pyx_code_cache.count = 1;\n            entries[0].code_line = code_line;\n            entries[0].code_object = code_object;\n            Py_INCREF(code_object);\n        }\n        return;\n    }\n    pos = __pyx_bisect_code_objects(__pyx_code_cache.entries, __pyx_code_cache.count, code_line);\n    if ((pos < __pyx_code_cache.count) && unlikely(__pyx_code_cache.entries[pos].code_line == code_line)) {\n        PyCodeObject* tmp = entries[pos].code_object;\n        entries[pos].code_object = code_object;\n        Py_DECREF(tmp);\n        return;\n    }\n    if (__pyx_code_cache.count == __pyx_code_cache.max_count) {\n        int new_max = __pyx_code_cache.max_count + 64;\n        entries = (__Pyx_CodeObjectCacheEntry*)PyMem_Realloc(\n            __pyx_code_cache.entries, ((size_t)new_max) * sizeof(__Pyx_CodeObjectCacheEntry));\n        if (unlikely(!entries)) {\n            return;\n        }\n        __pyx_code_cache.entries = entries;\n        __pyx_code_cache.max_count = new_max;\n    }\n    for (i=__pyx_code_cache.count; i>pos; i--) {\n        entries[i] = entries[i-1];\n    }\n    entries[pos].code_line = code_line;\n    entries[pos].code_object = code_object;\n    __pyx_code_cache.count++;\n    Py_INCREF(code_object);\n}\n\n/* AddTraceback */\n#include \"compile.h\"\n#include \"frameobject.h\"\n#include \"traceback.h\"\nstatic PyCodeObject* __Pyx_CreateCodeObjectForTraceback(\n            const char *funcname, int c_line,\n            int py_line, const char *filename) {\n    PyCodeObject *py_code = NULL;\n    PyObject *py_funcname = NULL;\n    #if PY_MAJOR_VERSION < 3\n    PyObject *py_srcfile = NULL;\n    py_srcfile = PyString_FromString(filename);\n    if (!py_srcfile) goto bad;\n    #endif\n    
if (c_line) {\n        #if PY_MAJOR_VERSION < 3\n        py_funcname = PyString_FromFormat( \"%s (%s:%d)\", funcname, __pyx_cfilenm, c_line);\n        if (!py_funcname) goto bad;\n        #else\n        py_funcname = PyUnicode_FromFormat( \"%s (%s:%d)\", funcname, __pyx_cfilenm, c_line);\n        if (!py_funcname) goto bad;\n        funcname = PyUnicode_AsUTF8(py_funcname);\n        if (!funcname) goto bad;\n        #endif\n    }\n    else {\n        #if PY_MAJOR_VERSION < 3\n        py_funcname = PyString_FromString(funcname);\n        if (!py_funcname) goto bad;\n        #endif\n    }\n    #if PY_MAJOR_VERSION < 3\n    py_code = __Pyx_PyCode_New(\n        0,\n        0,\n        0,\n        0,\n        0,\n        __pyx_empty_bytes, /*PyObject *code,*/\n        __pyx_empty_tuple, /*PyObject *consts,*/\n        __pyx_empty_tuple, /*PyObject *names,*/\n        __pyx_empty_tuple, /*PyObject *varnames,*/\n        __pyx_empty_tuple, /*PyObject *freevars,*/\n        __pyx_empty_tuple, /*PyObject *cellvars,*/\n        py_srcfile,   /*PyObject *filename,*/\n        py_funcname,  /*PyObject *name,*/\n        py_line,\n        __pyx_empty_bytes  /*PyObject *lnotab*/\n    );\n    Py_DECREF(py_srcfile);\n    #else\n    py_code = PyCode_NewEmpty(filename, funcname, py_line);\n    #endif\n    Py_XDECREF(py_funcname);  // XDECREF since it's only set on Py3 if cline\n    return py_code;\nbad:\n    Py_XDECREF(py_funcname);\n    #if PY_MAJOR_VERSION < 3\n    Py_XDECREF(py_srcfile);\n    #endif\n    return NULL;\n}\nstatic void __Pyx_AddTraceback(const char *funcname, int c_line,\n                               int py_line, const char *filename) {\n    PyCodeObject *py_code = 0;\n    PyFrameObject *py_frame = 0;\n    PyThreadState *tstate = __Pyx_PyThreadState_Current;\n    if (c_line) {\n        c_line = __Pyx_CLineForTraceback(tstate, c_line);\n    }\n    py_code = __pyx_find_code_object(c_line ? 
-c_line : py_line);\n    if (!py_code) {\n        py_code = __Pyx_CreateCodeObjectForTraceback(\n            funcname, c_line, py_line, filename);\n        if (!py_code) goto bad;\n        __pyx_insert_code_object(c_line ? -c_line : py_line, py_code);\n    }\n    py_frame = PyFrame_New(\n        tstate,            /*PyThreadState *tstate,*/\n        py_code,           /*PyCodeObject *code,*/\n        __pyx_d,    /*PyObject *globals,*/\n        0                  /*PyObject *locals*/\n    );\n    if (!py_frame) goto bad;\n    __Pyx_PyFrame_SetLineNumber(py_frame, py_line);\n    PyTraceBack_Here(py_frame);\nbad:\n    Py_XDECREF(py_code);\n    Py_XDECREF(py_frame);\n}\n\n#if PY_MAJOR_VERSION < 3\nstatic int __Pyx_GetBuffer(PyObject *obj, Py_buffer *view, int flags) {\n    if (PyObject_CheckBuffer(obj)) return PyObject_GetBuffer(obj, view, flags);\n        if (__Pyx_TypeCheck(obj, __pyx_array_type)) return __pyx_array_getbuffer(obj, view, flags);\n        if (__Pyx_TypeCheck(obj, __pyx_memoryview_type)) return __pyx_memoryview_getbuffer(obj, view, flags);\n    PyErr_Format(PyExc_TypeError, \"'%.200s' does not have the buffer interface\", Py_TYPE(obj)->tp_name);\n    return -1;\n}\nstatic void __Pyx_ReleaseBuffer(Py_buffer *view) {\n    PyObject *obj = view->obj;\n    if (!obj) return;\n    if (PyObject_CheckBuffer(obj)) {\n        PyBuffer_Release(view);\n        return;\n    }\n    if ((0)) {}\n    view->obj = NULL;\n    Py_DECREF(obj);\n}\n#endif\n\n\n/* MemviewSliceIsContig */\nstatic int\n__pyx_memviewslice_is_contig(const __Pyx_memviewslice mvs, char order, int ndim)\n{\n    int i, index, step, start;\n    Py_ssize_t itemsize = mvs.memview->view.itemsize;\n    if (order == 'F') {\n        step = 1;\n        start = 0;\n    } else {\n        step = -1;\n        start = ndim - 1;\n    }\n    for (i = 0; i < ndim; i++) {\n        index = start + step * i;\n        if (mvs.suboffsets[index] >= 0 || mvs.strides[index] != itemsize)\n            return 0;\n        
itemsize *= mvs.shape[index];\n    }\n    return 1;\n}\n\n/* OverlappingSlices */\nstatic void\n__pyx_get_array_memory_extents(__Pyx_memviewslice *slice,\n                               void **out_start, void **out_end,\n                               int ndim, size_t itemsize)\n{\n    char *start, *end;\n    int i;\n    start = end = slice->data;\n    for (i = 0; i < ndim; i++) {\n        Py_ssize_t stride = slice->strides[i];\n        Py_ssize_t extent = slice->shape[i];\n        if (extent == 0) {\n            *out_start = *out_end = start;\n            return;\n        } else {\n            if (stride > 0)\n                end += stride * (extent - 1);\n            else\n                start += stride * (extent - 1);\n        }\n    }\n    *out_start = start;\n    *out_end = end + itemsize;\n}\nstatic int\n__pyx_slices_overlap(__Pyx_memviewslice *slice1,\n                     __Pyx_memviewslice *slice2,\n                     int ndim, size_t itemsize)\n{\n    void *start1, *end1, *start2, *end2;\n    __pyx_get_array_memory_extents(slice1, &start1, &end1, ndim, itemsize);\n    __pyx_get_array_memory_extents(slice2, &start2, &end2, ndim, itemsize);\n    return (start1 < end2) && (start2 < end1);\n}\n\n/* Capsule */\nstatic CYTHON_INLINE PyObject *\n__pyx_capsule_create(void *p, CYTHON_UNUSED const char *sig)\n{\n    PyObject *cobj;\n#if PY_VERSION_HEX >= 0x02070000\n    cobj = PyCapsule_New(p, sig, NULL);\n#else\n    cobj = PyCObject_FromVoidPtr(p, NULL);\n#endif\n    return cobj;\n}\n\n/* IsLittleEndian */\nstatic CYTHON_INLINE int __Pyx_Is_Little_Endian(void)\n{\n  union {\n    uint32_t u32;\n    uint8_t u8[4];\n  } S;\n  S.u32 = 0x01020304;\n  return S.u8[0] == 4;\n}\n\n/* BufferFormatCheck */\nstatic void __Pyx_BufFmt_Init(__Pyx_BufFmt_Context* ctx,\n                              __Pyx_BufFmt_StackElem* stack,\n                              __Pyx_TypeInfo* type) {\n  stack[0].field = &ctx->root;\n  stack[0].parent_offset = 0;\n  ctx->root.type = type;\n  
ctx->root.name = \"buffer dtype\";\n  ctx->root.offset = 0;\n  ctx->head = stack;\n  ctx->head->field = &ctx->root;\n  ctx->fmt_offset = 0;\n  ctx->head->parent_offset = 0;\n  ctx->new_packmode = '@';\n  ctx->enc_packmode = '@';\n  ctx->new_count = 1;\n  ctx->enc_count = 0;\n  ctx->enc_type = 0;\n  ctx->is_complex = 0;\n  ctx->is_valid_array = 0;\n  ctx->struct_alignment = 0;\n  while (type->typegroup == 'S') {\n    ++ctx->head;\n    ctx->head->field = type->fields;\n    ctx->head->parent_offset = 0;\n    type = type->fields->type;\n  }\n}\nstatic int __Pyx_BufFmt_ParseNumber(const char** ts) {\n    int count;\n    const char* t = *ts;\n    if (*t < '0' || *t > '9') {\n      return -1;\n    } else {\n        count = *t++ - '0';\n        while (*t >= '0' && *t <= '9') {\n            count *= 10;\n            count += *t++ - '0';\n        }\n    }\n    *ts = t;\n    return count;\n}\nstatic int __Pyx_BufFmt_ExpectNumber(const char **ts) {\n    int number = __Pyx_BufFmt_ParseNumber(ts);\n    if (number == -1)\n        PyErr_Format(PyExc_ValueError,\\\n                     \"Does not understand character buffer dtype format string ('%c')\", **ts);\n    return number;\n}\nstatic void __Pyx_BufFmt_RaiseUnexpectedChar(char ch) {\n  PyErr_Format(PyExc_ValueError,\n               \"Unexpected format string character: '%c'\", ch);\n}\nstatic const char* __Pyx_BufFmt_DescribeTypeChar(char ch, int is_complex) {\n  switch (ch) {\n    case '?': return \"'bool'\";\n    case 'c': return \"'char'\";\n    case 'b': return \"'signed char'\";\n    case 'B': return \"'unsigned char'\";\n    case 'h': return \"'short'\";\n    case 'H': return \"'unsigned short'\";\n    case 'i': return \"'int'\";\n    case 'I': return \"'unsigned int'\";\n    case 'l': return \"'long'\";\n    case 'L': return \"'unsigned long'\";\n    case 'q': return \"'long long'\";\n    case 'Q': return \"'unsigned long long'\";\n    case 'f': return (is_complex ? 
\"'complex float'\" : \"'float'\");\n    case 'd': return (is_complex ? \"'complex double'\" : \"'double'\");\n    case 'g': return (is_complex ? \"'complex long double'\" : \"'long double'\");\n    case 'T': return \"a struct\";\n    case 'O': return \"Python object\";\n    case 'P': return \"a pointer\";\n    case 's': case 'p': return \"a string\";\n    case 0: return \"end\";\n    default: return \"unparseable format string\";\n  }\n}\nstatic size_t __Pyx_BufFmt_TypeCharToStandardSize(char ch, int is_complex) {\n  switch (ch) {\n    case '?': case 'c': case 'b': case 'B': case 's': case 'p': return 1;\n    case 'h': case 'H': return 2;\n    case 'i': case 'I': case 'l': case 'L': return 4;\n    case 'q': case 'Q': return 8;\n    case 'f': return (is_complex ? 8 : 4);\n    case 'd': return (is_complex ? 16 : 8);\n    case 'g': {\n      PyErr_SetString(PyExc_ValueError, \"Python does not define a standard format string size for long double ('g')..\");\n      return 0;\n    }\n    case 'O': case 'P': return sizeof(void*);\n    default:\n      __Pyx_BufFmt_RaiseUnexpectedChar(ch);\n      return 0;\n    }\n}\nstatic size_t __Pyx_BufFmt_TypeCharToNativeSize(char ch, int is_complex) {\n  switch (ch) {\n    case '?': case 'c': case 'b': case 'B': case 's': case 'p': return 1;\n    case 'h': case 'H': return sizeof(short);\n    case 'i': case 'I': return sizeof(int);\n    case 'l': case 'L': return sizeof(long);\n    #ifdef HAVE_LONG_LONG\n    case 'q': case 'Q': return sizeof(PY_LONG_LONG);\n    #endif\n    case 'f': return sizeof(float) * (is_complex ? 2 : 1);\n    case 'd': return sizeof(double) * (is_complex ? 2 : 1);\n    case 'g': return sizeof(long double) * (is_complex ? 
2 : 1);\n    case 'O': case 'P': return sizeof(void*);\n    default: {\n      __Pyx_BufFmt_RaiseUnexpectedChar(ch);\n      return 0;\n    }\n  }\n}\ntypedef struct { char c; short x; } __Pyx_st_short;\ntypedef struct { char c; int x; } __Pyx_st_int;\ntypedef struct { char c; long x; } __Pyx_st_long;\ntypedef struct { char c; float x; } __Pyx_st_float;\ntypedef struct { char c; double x; } __Pyx_st_double;\ntypedef struct { char c; long double x; } __Pyx_st_longdouble;\ntypedef struct { char c; void *x; } __Pyx_st_void_p;\n#ifdef HAVE_LONG_LONG\ntypedef struct { char c; PY_LONG_LONG x; } __Pyx_st_longlong;\n#endif\nstatic size_t __Pyx_BufFmt_TypeCharToAlignment(char ch, CYTHON_UNUSED int is_complex) {\n  switch (ch) {\n    case '?': case 'c': case 'b': case 'B': case 's': case 'p': return 1;\n    case 'h': case 'H': return sizeof(__Pyx_st_short) - sizeof(short);\n    case 'i': case 'I': return sizeof(__Pyx_st_int) - sizeof(int);\n    case 'l': case 'L': return sizeof(__Pyx_st_long) - sizeof(long);\n#ifdef HAVE_LONG_LONG\n    case 'q': case 'Q': return sizeof(__Pyx_st_longlong) - sizeof(PY_LONG_LONG);\n#endif\n    case 'f': return sizeof(__Pyx_st_float) - sizeof(float);\n    case 'd': return sizeof(__Pyx_st_double) - sizeof(double);\n    case 'g': return sizeof(__Pyx_st_longdouble) - sizeof(long double);\n    case 'P': case 'O': return sizeof(__Pyx_st_void_p) - sizeof(void*);\n    default:\n      __Pyx_BufFmt_RaiseUnexpectedChar(ch);\n      return 0;\n    }\n}\n/* These are for computing the padding at the end of the struct to align\n   on the first member of the struct. 
This will probably be the same as above,\n   but we don't have any guarantees.\n */\ntypedef struct { short x; char c; } __Pyx_pad_short;\ntypedef struct { int x; char c; } __Pyx_pad_int;\ntypedef struct { long x; char c; } __Pyx_pad_long;\ntypedef struct { float x; char c; } __Pyx_pad_float;\ntypedef struct { double x; char c; } __Pyx_pad_double;\ntypedef struct { long double x; char c; } __Pyx_pad_longdouble;\ntypedef struct { void *x; char c; } __Pyx_pad_void_p;\n#ifdef HAVE_LONG_LONG\ntypedef struct { PY_LONG_LONG x; char c; } __Pyx_pad_longlong;\n#endif\nstatic size_t __Pyx_BufFmt_TypeCharToPadding(char ch, CYTHON_UNUSED int is_complex) {\n  switch (ch) {\n    case '?': case 'c': case 'b': case 'B': case 's': case 'p': return 1;\n    case 'h': case 'H': return sizeof(__Pyx_pad_short) - sizeof(short);\n    case 'i': case 'I': return sizeof(__Pyx_pad_int) - sizeof(int);\n    case 'l': case 'L': return sizeof(__Pyx_pad_long) - sizeof(long);\n#ifdef HAVE_LONG_LONG\n    case 'q': case 'Q': return sizeof(__Pyx_pad_longlong) - sizeof(PY_LONG_LONG);\n#endif\n    case 'f': return sizeof(__Pyx_pad_float) - sizeof(float);\n    case 'd': return sizeof(__Pyx_pad_double) - sizeof(double);\n    case 'g': return sizeof(__Pyx_pad_longdouble) - sizeof(long double);\n    case 'P': case 'O': return sizeof(__Pyx_pad_void_p) - sizeof(void*);\n    default:\n      __Pyx_BufFmt_RaiseUnexpectedChar(ch);\n      return 0;\n    }\n}\nstatic char __Pyx_BufFmt_TypeCharToGroup(char ch, int is_complex) {\n  switch (ch) {\n    case 'c':\n        return 'H';\n    case 'b': case 'h': case 'i':\n    case 'l': case 'q': case 's': case 'p':\n        return 'I';\n    case '?': case 'B': case 'H': case 'I': case 'L': case 'Q':\n        return 'U';\n    case 'f': case 'd': case 'g':\n        return (is_complex ? 
'C' : 'R');\n    case 'O':\n        return 'O';\n    case 'P':\n        return 'P';\n    default: {\n      __Pyx_BufFmt_RaiseUnexpectedChar(ch);\n      return 0;\n    }\n  }\n}\nstatic void __Pyx_BufFmt_RaiseExpected(__Pyx_BufFmt_Context* ctx) {\n  if (ctx->head == NULL || ctx->head->field == &ctx->root) {\n    const char* expected;\n    const char* quote;\n    if (ctx->head == NULL) {\n      expected = \"end\";\n      quote = \"\";\n    } else {\n      expected = ctx->head->field->type->name;\n      quote = \"'\";\n    }\n    PyErr_Format(PyExc_ValueError,\n                 \"Buffer dtype mismatch, expected %s%s%s but got %s\",\n                 quote, expected, quote,\n                 __Pyx_BufFmt_DescribeTypeChar(ctx->enc_type, ctx->is_complex));\n  } else {\n    __Pyx_StructField* field = ctx->head->field;\n    __Pyx_StructField* parent = (ctx->head - 1)->field;\n    PyErr_Format(PyExc_ValueError,\n                 \"Buffer dtype mismatch, expected '%s' but got %s in '%s.%s'\",\n                 field->type->name, __Pyx_BufFmt_DescribeTypeChar(ctx->enc_type, ctx->is_complex),\n                 parent->type->name, field->name);\n  }\n}\nstatic int __Pyx_BufFmt_ProcessTypeChunk(__Pyx_BufFmt_Context* ctx) {\n  char group;\n  size_t size, offset, arraysize = 1;\n  if (ctx->enc_type == 0) return 0;\n  if (ctx->head->field->type->arraysize[0]) {\n    int i, ndim = 0;\n    if (ctx->enc_type == 's' || ctx->enc_type == 'p') {\n        ctx->is_valid_array = ctx->head->field->type->ndim == 1;\n        ndim = 1;\n        if (ctx->enc_count != ctx->head->field->type->arraysize[0]) {\n            PyErr_Format(PyExc_ValueError,\n                         \"Expected a dimension of size %zu, got %zu\",\n                         ctx->head->field->type->arraysize[0], ctx->enc_count);\n            return -1;\n        }\n    }\n    if (!ctx->is_valid_array) {\n      PyErr_Format(PyExc_ValueError, \"Expected %d dimensions, got %d\",\n                   ctx->head->field->type->ndim, 
ndim);\n      return -1;\n    }\n    for (i = 0; i < ctx->head->field->type->ndim; i++) {\n      arraysize *= ctx->head->field->type->arraysize[i];\n    }\n    ctx->is_valid_array = 0;\n    ctx->enc_count = 1;\n  }\n  group = __Pyx_BufFmt_TypeCharToGroup(ctx->enc_type, ctx->is_complex);\n  do {\n    __Pyx_StructField* field = ctx->head->field;\n    __Pyx_TypeInfo* type = field->type;\n    if (ctx->enc_packmode == '@' || ctx->enc_packmode == '^') {\n      size = __Pyx_BufFmt_TypeCharToNativeSize(ctx->enc_type, ctx->is_complex);\n    } else {\n      size = __Pyx_BufFmt_TypeCharToStandardSize(ctx->enc_type, ctx->is_complex);\n    }\n    if (ctx->enc_packmode == '@') {\n      size_t align_at = __Pyx_BufFmt_TypeCharToAlignment(ctx->enc_type, ctx->is_complex);\n      size_t align_mod_offset;\n      if (align_at == 0) return -1;\n      align_mod_offset = ctx->fmt_offset % align_at;\n      if (align_mod_offset > 0) ctx->fmt_offset += align_at - align_mod_offset;\n      if (ctx->struct_alignment == 0)\n          ctx->struct_alignment = __Pyx_BufFmt_TypeCharToPadding(ctx->enc_type,\n                                                                 ctx->is_complex);\n    }\n    if (type->size != size || type->typegroup != group) {\n      if (type->typegroup == 'C' && type->fields != NULL) {\n        size_t parent_offset = ctx->head->parent_offset + field->offset;\n        ++ctx->head;\n        ctx->head->field = type->fields;\n        ctx->head->parent_offset = parent_offset;\n        continue;\n      }\n      if ((type->typegroup == 'H' || group == 'H') && type->size == size) {\n      } else {\n          __Pyx_BufFmt_RaiseExpected(ctx);\n          return -1;\n      }\n    }\n    offset = ctx->head->parent_offset + field->offset;\n    if (ctx->fmt_offset != offset) {\n      PyErr_Format(PyExc_ValueError,\n                   \"Buffer dtype mismatch; next field is at offset %\" CYTHON_FORMAT_SSIZE_T \"d but %\" CYTHON_FORMAT_SSIZE_T \"d expected\",\n                   
(Py_ssize_t)ctx->fmt_offset, (Py_ssize_t)offset);\n      return -1;\n    }\n    ctx->fmt_offset += size;\n    if (arraysize)\n      ctx->fmt_offset += (arraysize - 1) * size;\n    --ctx->enc_count;\n    while (1) {\n      if (field == &ctx->root) {\n        ctx->head = NULL;\n        if (ctx->enc_count != 0) {\n          __Pyx_BufFmt_RaiseExpected(ctx);\n          return -1;\n        }\n        break;\n      }\n      ctx->head->field = ++field;\n      if (field->type == NULL) {\n        --ctx->head;\n        field = ctx->head->field;\n        continue;\n      } else if (field->type->typegroup == 'S') {\n        size_t parent_offset = ctx->head->parent_offset + field->offset;\n        if (field->type->fields->type == NULL) continue;\n        field = field->type->fields;\n        ++ctx->head;\n        ctx->head->field = field;\n        ctx->head->parent_offset = parent_offset;\n        break;\n      } else {\n        break;\n      }\n    }\n  } while (ctx->enc_count);\n  ctx->enc_type = 0;\n  ctx->is_complex = 0;\n  return 0;\n}\nstatic PyObject *\n__pyx_buffmt_parse_array(__Pyx_BufFmt_Context* ctx, const char** tsp)\n{\n    const char *ts = *tsp;\n    int i = 0, number, ndim;\n    ++ts;\n    if (ctx->new_count != 1) {\n        PyErr_SetString(PyExc_ValueError,\n                        \"Cannot handle repeated arrays in format string\");\n        return NULL;\n    }\n    if (__Pyx_BufFmt_ProcessTypeChunk(ctx) == -1) return NULL;\n    ndim = ctx->head->field->type->ndim;\n    while (*ts && *ts != ')') {\n        switch (*ts) {\n            case ' ': case '\\f': case '\\r': case '\\n': case '\\t': case '\\v':  continue;\n            default:  break;\n        }\n        number = __Pyx_BufFmt_ExpectNumber(&ts);\n        if (number == -1) return NULL;\n        if (i < ndim && (size_t) number != ctx->head->field->type->arraysize[i])\n            return PyErr_Format(PyExc_ValueError,\n                        \"Expected a dimension of size %zu, got %d\",\n                    
    ctx->head->field->type->arraysize[i], number);\n        if (*ts != ',' && *ts != ')')\n            return PyErr_Format(PyExc_ValueError,\n                                \"Expected a comma in format string, got '%c'\", *ts);\n        if (*ts == ',') ts++;\n        i++;\n    }\n    if (i != ndim)\n        return PyErr_Format(PyExc_ValueError, \"Expected %d dimension(s), got %d\",\n                            ctx->head->field->type->ndim, i);\n    if (!*ts) {\n        PyErr_SetString(PyExc_ValueError,\n                        \"Unexpected end of format string, expected ')'\");\n        return NULL;\n    }\n    ctx->is_valid_array = 1;\n    ctx->new_count = 1;\n    *tsp = ++ts;\n    return Py_None;\n}\nstatic const char* __Pyx_BufFmt_CheckString(__Pyx_BufFmt_Context* ctx, const char* ts) {\n  int got_Z = 0;\n  while (1) {\n    switch(*ts) {\n      case 0:\n        if (ctx->enc_type != 0 && ctx->head == NULL) {\n          __Pyx_BufFmt_RaiseExpected(ctx);\n          return NULL;\n        }\n        if (__Pyx_BufFmt_ProcessTypeChunk(ctx) == -1) return NULL;\n        if (ctx->head != NULL) {\n          __Pyx_BufFmt_RaiseExpected(ctx);\n          return NULL;\n        }\n        return ts;\n      case ' ':\n      case '\\r':\n      case '\\n':\n        ++ts;\n        break;\n      case '<':\n        if (!__Pyx_Is_Little_Endian()) {\n          PyErr_SetString(PyExc_ValueError, \"Little-endian buffer not supported on big-endian compiler\");\n          return NULL;\n        }\n        ctx->new_packmode = '=';\n        ++ts;\n        break;\n      case '>':\n      case '!':\n        if (__Pyx_Is_Little_Endian()) {\n          PyErr_SetString(PyExc_ValueError, \"Big-endian buffer not supported on little-endian compiler\");\n          return NULL;\n        }\n        ctx->new_packmode = '=';\n        ++ts;\n        break;\n      case '=':\n      case '@':\n      case '^':\n        ctx->new_packmode = *ts++;\n        break;\n      case 'T':\n        {\n          const char* 
ts_after_sub;\n          size_t i, struct_count = ctx->new_count;\n          size_t struct_alignment = ctx->struct_alignment;\n          ctx->new_count = 1;\n          ++ts;\n          if (*ts != '{') {\n            PyErr_SetString(PyExc_ValueError, \"Buffer acquisition: Expected '{' after 'T'\");\n            return NULL;\n          }\n          if (__Pyx_BufFmt_ProcessTypeChunk(ctx) == -1) return NULL;\n          ctx->enc_type = 0;\n          ctx->enc_count = 0;\n          ctx->struct_alignment = 0;\n          ++ts;\n          ts_after_sub = ts;\n          for (i = 0; i != struct_count; ++i) {\n            ts_after_sub = __Pyx_BufFmt_CheckString(ctx, ts);\n            if (!ts_after_sub) return NULL;\n          }\n          ts = ts_after_sub;\n          if (struct_alignment) ctx->struct_alignment = struct_alignment;\n        }\n        break;\n      case '}':\n        {\n          size_t alignment = ctx->struct_alignment;\n          ++ts;\n          if (__Pyx_BufFmt_ProcessTypeChunk(ctx) == -1) return NULL;\n          ctx->enc_type = 0;\n          if (alignment && ctx->fmt_offset % alignment) {\n            ctx->fmt_offset += alignment - (ctx->fmt_offset % alignment);\n          }\n        }\n        return ts;\n      case 'x':\n        if (__Pyx_BufFmt_ProcessTypeChunk(ctx) == -1) return NULL;\n        ctx->fmt_offset += ctx->new_count;\n        ctx->new_count = 1;\n        ctx->enc_count = 0;\n        ctx->enc_type = 0;\n        ctx->enc_packmode = ctx->new_packmode;\n        ++ts;\n        break;\n      case 'Z':\n        got_Z = 1;\n        ++ts;\n        if (*ts != 'f' && *ts != 'd' && *ts != 'g') {\n          __Pyx_BufFmt_RaiseUnexpectedChar('Z');\n          return NULL;\n        }\n        CYTHON_FALLTHROUGH;\n      case '?': case 'c': case 'b': case 'B': case 'h': case 'H': case 'i': case 'I':\n      case 'l': case 'L': case 'q': case 'Q':\n      case 'f': case 'd': case 'g':\n      case 'O': case 'p':\n        if ((ctx->enc_type == *ts) && (got_Z == 
ctx->is_complex) &&\n            (ctx->enc_packmode == ctx->new_packmode) && (!ctx->is_valid_array)) {\n          ctx->enc_count += ctx->new_count;\n          ctx->new_count = 1;\n          got_Z = 0;\n          ++ts;\n          break;\n        }\n        CYTHON_FALLTHROUGH;\n      case 's':\n        if (__Pyx_BufFmt_ProcessTypeChunk(ctx) == -1) return NULL;\n        ctx->enc_count = ctx->new_count;\n        ctx->enc_packmode = ctx->new_packmode;\n        ctx->enc_type = *ts;\n        ctx->is_complex = got_Z;\n        ++ts;\n        ctx->new_count = 1;\n        got_Z = 0;\n        break;\n      case ':':\n        ++ts;\n        while(*ts != ':') ++ts;\n        ++ts;\n        break;\n      case '(':\n        if (!__pyx_buffmt_parse_array(ctx, &ts)) return NULL;\n        break;\n      default:\n        {\n          int number = __Pyx_BufFmt_ExpectNumber(&ts);\n          if (number == -1) return NULL;\n          ctx->new_count = (size_t)number;\n        }\n    }\n  }\n}\n\n/* TypeInfoCompare */\n  static int\n__pyx_typeinfo_cmp(__Pyx_TypeInfo *a, __Pyx_TypeInfo *b)\n{\n    int i;\n    if (!a || !b)\n        return 0;\n    if (a == b)\n        return 1;\n    if (a->size != b->size || a->typegroup != b->typegroup ||\n            a->is_unsigned != b->is_unsigned || a->ndim != b->ndim) {\n        if (a->typegroup == 'H' || b->typegroup == 'H') {\n            return a->size == b->size;\n        } else {\n            return 0;\n        }\n    }\n    if (a->ndim) {\n        for (i = 0; i < a->ndim; i++)\n            if (a->arraysize[i] != b->arraysize[i])\n                return 0;\n    }\n    if (a->typegroup == 'S') {\n        if (a->flags != b->flags)\n            return 0;\n        if (a->fields || b->fields) {\n            if (!(a->fields && b->fields))\n                return 0;\n            for (i = 0; a->fields[i].type && b->fields[i].type; i++) {\n                __Pyx_StructField *field_a = a->fields + i;\n                __Pyx_StructField *field_b = b->fields + 
i;\n                if (field_a->offset != field_b->offset ||\n                    !__pyx_typeinfo_cmp(field_a->type, field_b->type))\n                    return 0;\n            }\n            return !a->fields[i].type && !b->fields[i].type;\n        }\n    }\n    return 1;\n}\n\n/* MemviewSliceValidateAndInit */\n  static int\n__pyx_check_strides(Py_buffer *buf, int dim, int ndim, int spec)\n{\n    if (buf->shape[dim] <= 1)\n        return 1;\n    if (buf->strides) {\n        if (spec & __Pyx_MEMVIEW_CONTIG) {\n            if (spec & (__Pyx_MEMVIEW_PTR|__Pyx_MEMVIEW_FULL)) {\n                if (unlikely(buf->strides[dim] != sizeof(void *))) {\n                    PyErr_Format(PyExc_ValueError,\n                                 \"Buffer is not indirectly contiguous \"\n                                 \"in dimension %d.\", dim);\n                    goto fail;\n                }\n            } else if (unlikely(buf->strides[dim] != buf->itemsize)) {\n                PyErr_SetString(PyExc_ValueError,\n                                \"Buffer and memoryview are not contiguous \"\n                                \"in the same dimension.\");\n                goto fail;\n            }\n        }\n        if (spec & __Pyx_MEMVIEW_FOLLOW) {\n            Py_ssize_t stride = buf->strides[dim];\n            if (stride < 0)\n                stride = -stride;\n            if (unlikely(stride < buf->itemsize)) {\n                PyErr_SetString(PyExc_ValueError,\n                                \"Buffer and memoryview are not contiguous \"\n                                \"in the same dimension.\");\n                goto fail;\n            }\n        }\n    } else {\n        if (unlikely(spec & __Pyx_MEMVIEW_CONTIG && dim != ndim - 1)) {\n            PyErr_Format(PyExc_ValueError,\n                         \"C-contiguous buffer is not contiguous in \"\n                         \"dimension %d\", dim);\n            goto fail;\n        } else if (unlikely(spec & 
(__Pyx_MEMVIEW_PTR))) {\n            PyErr_Format(PyExc_ValueError,\n                         \"C-contiguous buffer is not indirect in \"\n                         \"dimension %d\", dim);\n            goto fail;\n        } else if (unlikely(buf->suboffsets)) {\n            PyErr_SetString(PyExc_ValueError,\n                            \"Buffer exposes suboffsets but no strides\");\n            goto fail;\n        }\n    }\n    return 1;\nfail:\n    return 0;\n}\nstatic int\n__pyx_check_suboffsets(Py_buffer *buf, int dim, CYTHON_UNUSED int ndim, int spec)\n{\n    if (spec & __Pyx_MEMVIEW_DIRECT) {\n        if (unlikely(buf->suboffsets && buf->suboffsets[dim] >= 0)) {\n            PyErr_Format(PyExc_ValueError,\n                         \"Buffer not compatible with direct access \"\n                         \"in dimension %d.\", dim);\n            goto fail;\n        }\n    }\n    if (spec & __Pyx_MEMVIEW_PTR) {\n        if (unlikely(!buf->suboffsets || (buf->suboffsets[dim] < 0))) {\n            PyErr_Format(PyExc_ValueError,\n                         \"Buffer is not indirectly accessible \"\n                         \"in dimension %d.\", dim);\n            goto fail;\n        }\n    }\n    return 1;\nfail:\n    return 0;\n}\nstatic int\n__pyx_verify_contig(Py_buffer *buf, int ndim, int c_or_f_flag)\n{\n    int i;\n    if (c_or_f_flag & __Pyx_IS_F_CONTIG) {\n        Py_ssize_t stride = 1;\n        for (i = 0; i < ndim; i++) {\n            if (unlikely(stride * buf->itemsize != buf->strides[i]  &&  buf->shape[i] > 1)) {\n                PyErr_SetString(PyExc_ValueError,\n                    \"Buffer not fortran contiguous.\");\n                goto fail;\n            }\n            stride = stride * buf->shape[i];\n        }\n    } else if (c_or_f_flag & __Pyx_IS_C_CONTIG) {\n        Py_ssize_t stride = 1;\n        for (i = ndim - 1; i >- 1; i--) {\n            if (unlikely(stride * buf->itemsize != buf->strides[i]  &&  buf->shape[i] > 1)) {\n                
PyErr_SetString(PyExc_ValueError,\n                    \"Buffer not C contiguous.\");\n                goto fail;\n            }\n            stride = stride * buf->shape[i];\n        }\n    }\n    return 1;\nfail:\n    return 0;\n}\nstatic int __Pyx_ValidateAndInit_memviewslice(\n                int *axes_specs,\n                int c_or_f_flag,\n                int buf_flags,\n                int ndim,\n                __Pyx_TypeInfo *dtype,\n                __Pyx_BufFmt_StackElem stack[],\n                __Pyx_memviewslice *memviewslice,\n                PyObject *original_obj)\n{\n    struct __pyx_memoryview_obj *memview, *new_memview;\n    __Pyx_RefNannyDeclarations\n    Py_buffer *buf;\n    int i, spec = 0, retval = -1;\n    __Pyx_BufFmt_Context ctx;\n    int from_memoryview = __pyx_memoryview_check(original_obj);\n    __Pyx_RefNannySetupContext(\"ValidateAndInit_memviewslice\", 0);\n    if (from_memoryview && __pyx_typeinfo_cmp(dtype, ((struct __pyx_memoryview_obj *)\n                                                            original_obj)->typeinfo)) {\n        memview = (struct __pyx_memoryview_obj *) original_obj;\n        new_memview = NULL;\n    } else {\n        memview = (struct __pyx_memoryview_obj *) __pyx_memoryview_new(\n                                            original_obj, buf_flags, 0, dtype);\n        new_memview = memview;\n        if (unlikely(!memview))\n            goto fail;\n    }\n    buf = &memview->view;\n    if (unlikely(buf->ndim != ndim)) {\n        PyErr_Format(PyExc_ValueError,\n                \"Buffer has wrong number of dimensions (expected %d, got %d)\",\n                ndim, buf->ndim);\n        goto fail;\n    }\n    if (new_memview) {\n        __Pyx_BufFmt_Init(&ctx, stack, dtype);\n        if (unlikely(!__Pyx_BufFmt_CheckString(&ctx, buf->format))) goto fail;\n    }\n    if (unlikely((unsigned) buf->itemsize != dtype->size)) {\n        PyErr_Format(PyExc_ValueError,\n                     \"Item size of buffer (%\" 
CYTHON_FORMAT_SSIZE_T \"u byte%s) \"\n                     \"does not match size of '%s' (%\" CYTHON_FORMAT_SSIZE_T \"u byte%s)\",\n                     buf->itemsize,\n                     (buf->itemsize > 1) ? \"s\" : \"\",\n                     dtype->name,\n                     dtype->size,\n                     (dtype->size > 1) ? \"s\" : \"\");\n        goto fail;\n    }\n    if (buf->len > 0) {\n        for (i = 0; i < ndim; i++) {\n            spec = axes_specs[i];\n            if (unlikely(!__pyx_check_strides(buf, i, ndim, spec)))\n                goto fail;\n            if (unlikely(!__pyx_check_suboffsets(buf, i, ndim, spec)))\n                goto fail;\n        }\n        if (unlikely(buf->strides && !__pyx_verify_contig(buf, ndim, c_or_f_flag)))\n            goto fail;\n    }\n    if (unlikely(__Pyx_init_memviewslice(memview, ndim, memviewslice,\n                                         new_memview != NULL) == -1)) {\n        goto fail;\n    }\n    retval = 0;\n    goto no_fail;\nfail:\n    Py_XDECREF(new_memview);\n    retval = -1;\nno_fail:\n    __Pyx_RefNannyFinishContext();\n    return retval;\n}\n\n/* ObjectToMemviewSlice */\n  static CYTHON_INLINE __Pyx_memviewslice __Pyx_PyObject_to_MemoryviewSlice_d_d_dc_int(PyObject *obj, int writable_flag) {\n    __Pyx_memviewslice result = { 0, 0, { 0 }, { 0 }, { 0 } };\n    __Pyx_BufFmt_StackElem stack[1];\n    int axes_specs[] = { (__Pyx_MEMVIEW_DIRECT | __Pyx_MEMVIEW_FOLLOW), (__Pyx_MEMVIEW_DIRECT | __Pyx_MEMVIEW_FOLLOW), (__Pyx_MEMVIEW_DIRECT | __Pyx_MEMVIEW_CONTIG) };\n    int retcode;\n    if (obj == Py_None) {\n        result.memview = (struct __pyx_memoryview_obj *) Py_None;\n        return result;\n    }\n    retcode = __Pyx_ValidateAndInit_memviewslice(axes_specs, __Pyx_IS_C_CONTIG,\n                                                 (PyBUF_C_CONTIGUOUS | PyBUF_FORMAT) | writable_flag, 3,\n                                                 &__Pyx_TypeInfo_int, stack,\n                                 
                &result, obj);\n    if (unlikely(retcode == -1))\n        goto __pyx_fail;\n    return result;\n__pyx_fail:\n    result.memview = NULL;\n    result.data = NULL;\n    return result;\n}\n\n/* ObjectToMemviewSlice */\n  static CYTHON_INLINE __Pyx_memviewslice __Pyx_PyObject_to_MemoryviewSlice_d_d_dc_float(PyObject *obj, int writable_flag) {\n    __Pyx_memviewslice result = { 0, 0, { 0 }, { 0 }, { 0 } };\n    __Pyx_BufFmt_StackElem stack[1];\n    int axes_specs[] = { (__Pyx_MEMVIEW_DIRECT | __Pyx_MEMVIEW_FOLLOW), (__Pyx_MEMVIEW_DIRECT | __Pyx_MEMVIEW_FOLLOW), (__Pyx_MEMVIEW_DIRECT | __Pyx_MEMVIEW_CONTIG) };\n    int retcode;\n    if (obj == Py_None) {\n        result.memview = (struct __pyx_memoryview_obj *) Py_None;\n        return result;\n    }\n    retcode = __Pyx_ValidateAndInit_memviewslice(axes_specs, __Pyx_IS_C_CONTIG,\n                                                 (PyBUF_C_CONTIGUOUS | PyBUF_FORMAT) | writable_flag, 3,\n                                                 &__Pyx_TypeInfo_float, stack,\n                                                 &result, obj);\n    if (unlikely(retcode == -1))\n        goto __pyx_fail;\n    return result;\n__pyx_fail:\n    result.memview = NULL;\n    result.data = NULL;\n    return result;\n}\n\n/* ObjectToMemviewSlice */\n  static CYTHON_INLINE __Pyx_memviewslice __Pyx_PyObject_to_MemoryviewSlice_dc_int(PyObject *obj, int writable_flag) {\n    __Pyx_memviewslice result = { 0, 0, { 0 }, { 0 }, { 0 } };\n    __Pyx_BufFmt_StackElem stack[1];\n    int axes_specs[] = { (__Pyx_MEMVIEW_DIRECT | __Pyx_MEMVIEW_CONTIG) };\n    int retcode;\n    if (obj == Py_None) {\n        result.memview = (struct __pyx_memoryview_obj *) Py_None;\n        return result;\n    }\n    retcode = __Pyx_ValidateAndInit_memviewslice(axes_specs, __Pyx_IS_C_CONTIG,\n                                                 (PyBUF_C_CONTIGUOUS | PyBUF_FORMAT) | writable_flag, 1,\n                                                 &__Pyx_TypeInfo_int, 
stack,\n                                                 &result, obj);\n    if (unlikely(retcode == -1))\n        goto __pyx_fail;\n    return result;\n__pyx_fail:\n    result.memview = NULL;\n    result.data = NULL;\n    return result;\n}\n\n/* CIntFromPyVerify */\n  #define __PYX_VERIFY_RETURN_INT(target_type, func_type, func_value)\\\n    __PYX__VERIFY_RETURN_INT(target_type, func_type, func_value, 0)\n#define __PYX_VERIFY_RETURN_INT_EXC(target_type, func_type, func_value)\\\n    __PYX__VERIFY_RETURN_INT(target_type, func_type, func_value, 1)\n#define __PYX__VERIFY_RETURN_INT(target_type, func_type, func_value, exc)\\\n    {\\\n        func_type value = func_value;\\\n        if (sizeof(target_type) < sizeof(func_type)) {\\\n            if (unlikely(value != (func_type) (target_type) value)) {\\\n                func_type zero = 0;\\\n                if (exc && unlikely(value == (func_type)-1 && PyErr_Occurred()))\\\n                    return (target_type) -1;\\\n                if (is_unsigned && unlikely(value < zero))\\\n                    goto raise_neg_overflow;\\\n                else\\\n                    goto raise_overflow;\\\n            }\\\n        }\\\n        return (target_type) value;\\\n    }\n\n/* Declarations */\n  #if CYTHON_CCOMPLEX\n  #ifdef __cplusplus\n    static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_parts(float x, float y) {\n      return ::std::complex< float >(x, y);\n    }\n  #else\n    static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_parts(float x, float y) {\n      return x + y*(__pyx_t_float_complex)_Complex_I;\n    }\n  #endif\n#else\n    static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_parts(float x, float y) {\n      __pyx_t_float_complex z;\n      z.real = x;\n      z.imag = y;\n      return z;\n    }\n#endif\n\n/* Arithmetic */\n  #if CYTHON_CCOMPLEX\n#else\n    static CYTHON_INLINE int __Pyx_c_eq_float(__pyx_t_float_complex a, __pyx_t_float_complex b) {\n  
     return (a.real == b.real) && (a.imag == b.imag);\n    }\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_sum_float(__pyx_t_float_complex a, __pyx_t_float_complex b) {\n        __pyx_t_float_complex z;\n        z.real = a.real + b.real;\n        z.imag = a.imag + b.imag;\n        return z;\n    }\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_diff_float(__pyx_t_float_complex a, __pyx_t_float_complex b) {\n        __pyx_t_float_complex z;\n        z.real = a.real - b.real;\n        z.imag = a.imag - b.imag;\n        return z;\n    }\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_prod_float(__pyx_t_float_complex a, __pyx_t_float_complex b) {\n        __pyx_t_float_complex z;\n        z.real = a.real * b.real - a.imag * b.imag;\n        z.imag = a.real * b.imag + a.imag * b.real;\n        return z;\n    }\n    #if 1\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_quot_float(__pyx_t_float_complex a, __pyx_t_float_complex b) {\n        if (b.imag == 0) {\n            return __pyx_t_float_complex_from_parts(a.real / b.real, a.imag / b.real);\n        } else if (fabsf(b.real) >= fabsf(b.imag)) {\n            if (b.real == 0 && b.imag == 0) {\n                return __pyx_t_float_complex_from_parts(a.real / b.real, a.imag / b.imag);\n            } else {\n                float r = b.imag / b.real;\n                float s = (float)(1.0) / (b.real + b.imag * r);\n                return __pyx_t_float_complex_from_parts(\n                    (a.real + a.imag * r) * s, (a.imag - a.real * r) * s);\n            }\n        } else {\n            float r = b.real / b.imag;\n            float s = (float)(1.0) / (b.imag + b.real * r);\n            return __pyx_t_float_complex_from_parts(\n                (a.real * r + a.imag) * s, (a.imag * r - a.real) * s);\n        }\n    }\n    #else\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_quot_float(__pyx_t_float_complex a, __pyx_t_float_complex b) {\n        if (b.imag == 0) {\n            
return __pyx_t_float_complex_from_parts(a.real / b.real, a.imag / b.real);\n        } else {\n            float denom = b.real * b.real + b.imag * b.imag;\n            return __pyx_t_float_complex_from_parts(\n                (a.real * b.real + a.imag * b.imag) / denom,\n                (a.imag * b.real - a.real * b.imag) / denom);\n        }\n    }\n    #endif\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_neg_float(__pyx_t_float_complex a) {\n        __pyx_t_float_complex z;\n        z.real = -a.real;\n        z.imag = -a.imag;\n        return z;\n    }\n    static CYTHON_INLINE int __Pyx_c_is_zero_float(__pyx_t_float_complex a) {\n       return (a.real == 0) && (a.imag == 0);\n    }\n    static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_conj_float(__pyx_t_float_complex a) {\n        __pyx_t_float_complex z;\n        z.real =  a.real;\n        z.imag = -a.imag;\n        return z;\n    }\n    #if 1\n        static CYTHON_INLINE float __Pyx_c_abs_float(__pyx_t_float_complex z) {\n          #if !defined(HAVE_HYPOT) || defined(_MSC_VER)\n            return sqrtf(z.real*z.real + z.imag*z.imag);\n          #else\n            return hypotf(z.real, z.imag);\n          #endif\n        }\n        static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_pow_float(__pyx_t_float_complex a, __pyx_t_float_complex b) {\n            __pyx_t_float_complex z;\n            float r, lnr, theta, z_r, z_theta;\n            if (b.imag == 0 && b.real == (int)b.real) {\n                if (b.real < 0) {\n                    float denom = a.real * a.real + a.imag * a.imag;\n                    a.real = a.real / denom;\n                    a.imag = -a.imag / denom;\n                    b.real = -b.real;\n                }\n                switch ((int)b.real) {\n                    case 0:\n                        z.real = 1;\n                        z.imag = 0;\n                        return z;\n                    case 1:\n                        return a;\n                    case 
2:\n                        return __Pyx_c_prod_float(a, a);\n                    case 3:\n                        z = __Pyx_c_prod_float(a, a);\n                        return __Pyx_c_prod_float(z, a);\n                    case 4:\n                        z = __Pyx_c_prod_float(a, a);\n                        return __Pyx_c_prod_float(z, z);\n                }\n            }\n            if (a.imag == 0) {\n                if (a.real == 0) {\n                    return a;\n                } else if (b.imag == 0) {\n                    z.real = powf(a.real, b.real);\n                    z.imag = 0;\n                    return z;\n                } else if (a.real > 0) {\n                    r = a.real;\n                    theta = 0;\n                } else {\n                    r = -a.real;\n                    theta = atan2f(0.0, -1.0);\n                }\n            } else {\n                r = __Pyx_c_abs_float(a);\n                theta = atan2f(a.imag, a.real);\n            }\n            lnr = logf(r);\n            z_r = expf(lnr * b.real - theta * b.imag);\n            z_theta = theta * b.real + lnr * b.imag;\n            z.real = z_r * cosf(z_theta);\n            z.imag = z_r * sinf(z_theta);\n            return z;\n        }\n    #endif\n#endif\n\n/* Declarations */\n  #if CYTHON_CCOMPLEX\n  #ifdef __cplusplus\n    static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_parts(double x, double y) {\n      return ::std::complex< double >(x, y);\n    }\n  #else\n    static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_parts(double x, double y) {\n      return x + y*(__pyx_t_double_complex)_Complex_I;\n    }\n  #endif\n#else\n    static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_parts(double x, double y) {\n      __pyx_t_double_complex z;\n      z.real = x;\n      z.imag = y;\n      return z;\n    }\n#endif\n\n/* Arithmetic */\n  #if CYTHON_CCOMPLEX\n#else\n    static CYTHON_INLINE int 
__Pyx_c_eq_double(__pyx_t_double_complex a, __pyx_t_double_complex b) {\n       return (a.real == b.real) && (a.imag == b.imag);\n    }\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_sum_double(__pyx_t_double_complex a, __pyx_t_double_complex b) {\n        __pyx_t_double_complex z;\n        z.real = a.real + b.real;\n        z.imag = a.imag + b.imag;\n        return z;\n    }\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_diff_double(__pyx_t_double_complex a, __pyx_t_double_complex b) {\n        __pyx_t_double_complex z;\n        z.real = a.real - b.real;\n        z.imag = a.imag - b.imag;\n        return z;\n    }\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_prod_double(__pyx_t_double_complex a, __pyx_t_double_complex b) {\n        __pyx_t_double_complex z;\n        z.real = a.real * b.real - a.imag * b.imag;\n        z.imag = a.real * b.imag + a.imag * b.real;\n        return z;\n    }\n    #if 1\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_quot_double(__pyx_t_double_complex a, __pyx_t_double_complex b) {\n        if (b.imag == 0) {\n            return __pyx_t_double_complex_from_parts(a.real / b.real, a.imag / b.real);\n        } else if (fabs(b.real) >= fabs(b.imag)) {\n            if (b.real == 0 && b.imag == 0) {\n                return __pyx_t_double_complex_from_parts(a.real / b.real, a.imag / b.imag);\n            } else {\n                double r = b.imag / b.real;\n                double s = (double)(1.0) / (b.real + b.imag * r);\n                return __pyx_t_double_complex_from_parts(\n                    (a.real + a.imag * r) * s, (a.imag - a.real * r) * s);\n            }\n        } else {\n            double r = b.real / b.imag;\n            double s = (double)(1.0) / (b.imag + b.real * r);\n            return __pyx_t_double_complex_from_parts(\n                (a.real * r + a.imag) * s, (a.imag * r - a.real) * s);\n        }\n    }\n    #else\n    static CYTHON_INLINE __pyx_t_double_complex 
__Pyx_c_quot_double(__pyx_t_double_complex a, __pyx_t_double_complex b) {\n        if (b.imag == 0) {\n            return __pyx_t_double_complex_from_parts(a.real / b.real, a.imag / b.real);\n        } else {\n            double denom = b.real * b.real + b.imag * b.imag;\n            return __pyx_t_double_complex_from_parts(\n                (a.real * b.real + a.imag * b.imag) / denom,\n                (a.imag * b.real - a.real * b.imag) / denom);\n        }\n    }\n    #endif\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_neg_double(__pyx_t_double_complex a) {\n        __pyx_t_double_complex z;\n        z.real = -a.real;\n        z.imag = -a.imag;\n        return z;\n    }\n    static CYTHON_INLINE int __Pyx_c_is_zero_double(__pyx_t_double_complex a) {\n       return (a.real == 0) && (a.imag == 0);\n    }\n    static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_conj_double(__pyx_t_double_complex a) {\n        __pyx_t_double_complex z;\n        z.real =  a.real;\n        z.imag = -a.imag;\n        return z;\n    }\n    #if 1\n        static CYTHON_INLINE double __Pyx_c_abs_double(__pyx_t_double_complex z) {\n          #if !defined(HAVE_HYPOT) || defined(_MSC_VER)\n            return sqrt(z.real*z.real + z.imag*z.imag);\n          #else\n            return hypot(z.real, z.imag);\n          #endif\n        }\n        static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_pow_double(__pyx_t_double_complex a, __pyx_t_double_complex b) {\n            __pyx_t_double_complex z;\n            double r, lnr, theta, z_r, z_theta;\n            if (b.imag == 0 && b.real == (int)b.real) {\n                if (b.real < 0) {\n                    double denom = a.real * a.real + a.imag * a.imag;\n                    a.real = a.real / denom;\n                    a.imag = -a.imag / denom;\n                    b.real = -b.real;\n                }\n                switch ((int)b.real) {\n                    case 0:\n                        z.real = 1;\n                        
z.imag = 0;\n                        return z;\n                    case 1:\n                        return a;\n                    case 2:\n                        return __Pyx_c_prod_double(a, a);\n                    case 3:\n                        z = __Pyx_c_prod_double(a, a);\n                        return __Pyx_c_prod_double(z, a);\n                    case 4:\n                        z = __Pyx_c_prod_double(a, a);\n                        return __Pyx_c_prod_double(z, z);\n                }\n            }\n            if (a.imag == 0) {\n                if (a.real == 0) {\n                    return a;\n                } else if (b.imag == 0) {\n                    z.real = pow(a.real, b.real);\n                    z.imag = 0;\n                    return z;\n                } else if (a.real > 0) {\n                    r = a.real;\n                    theta = 0;\n                } else {\n                    r = -a.real;\n                    theta = atan2(0.0, -1.0);\n                }\n            } else {\n                r = __Pyx_c_abs_double(a);\n                theta = atan2(a.imag, a.real);\n            }\n            lnr = log(r);\n            z_r = exp(lnr * b.real - theta * b.imag);\n            z_theta = theta * b.real + lnr * b.imag;\n            z.real = z_r * cos(z_theta);\n            z.imag = z_r * sin(z_theta);\n            return z;\n        }\n    #endif\n#endif\n\n/* MemviewSliceCopyTemplate */\n  static __Pyx_memviewslice\n__pyx_memoryview_copy_new_contig(const __Pyx_memviewslice *from_mvs,\n                                 const char *mode, int ndim,\n                                 size_t sizeof_dtype, int contig_flag,\n                                 int dtype_is_object)\n{\n    __Pyx_RefNannyDeclarations\n    int i;\n    __Pyx_memviewslice new_mvs = { 0, 0, { 0 }, { 0 }, { 0 } };\n    struct __pyx_memoryview_obj *from_memview = from_mvs->memview;\n    Py_buffer *buf = &from_memview->view;\n    PyObject *shape_tuple = NULL;\n    
PyObject *temp_int = NULL;\n    struct __pyx_array_obj *array_obj = NULL;\n    struct __pyx_memoryview_obj *memview_obj = NULL;\n    __Pyx_RefNannySetupContext(\"__pyx_memoryview_copy_new_contig\", 0);\n    for (i = 0; i < ndim; i++) {\n        if (unlikely(from_mvs->suboffsets[i] >= 0)) {\n            PyErr_Format(PyExc_ValueError, \"Cannot copy memoryview slice with \"\n                                           \"indirect dimensions (axis %d)\", i);\n            goto fail;\n        }\n    }\n    shape_tuple = PyTuple_New(ndim);\n    if (unlikely(!shape_tuple)) {\n        goto fail;\n    }\n    __Pyx_GOTREF(shape_tuple);\n    for(i = 0; i < ndim; i++) {\n        temp_int = PyInt_FromSsize_t(from_mvs->shape[i]);\n        if(unlikely(!temp_int)) {\n            goto fail;\n        } else {\n            PyTuple_SET_ITEM(shape_tuple, i, temp_int);\n            temp_int = NULL;\n        }\n    }\n    array_obj = __pyx_array_new(shape_tuple, sizeof_dtype, buf->format, (char *) mode, NULL);\n    if (unlikely(!array_obj)) {\n        goto fail;\n    }\n    __Pyx_GOTREF(array_obj);\n    memview_obj = (struct __pyx_memoryview_obj *) __pyx_memoryview_new(\n                                    (PyObject *) array_obj, contig_flag,\n                                    dtype_is_object,\n                                    from_mvs->memview->typeinfo);\n    if (unlikely(!memview_obj))\n        goto fail;\n    if (unlikely(__Pyx_init_memviewslice(memview_obj, ndim, &new_mvs, 1) < 0))\n        goto fail;\n    if (unlikely(__pyx_memoryview_copy_contents(*from_mvs, new_mvs, ndim, ndim,\n                                                dtype_is_object) < 0))\n        goto fail;\n    goto no_fail;\nfail:\n    __Pyx_XDECREF(new_mvs.memview);\n    new_mvs.memview = NULL;\n    new_mvs.data = NULL;\nno_fail:\n    __Pyx_XDECREF(shape_tuple);\n    __Pyx_XDECREF(temp_int);\n    __Pyx_XDECREF(array_obj);\n    __Pyx_RefNannyFinishContext();\n    return new_mvs;\n}\n\n/* CIntToPy */\n  static 
CYTHON_INLINE PyObject* __Pyx_PyInt_From_int(int value) {\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic push\n#pragma GCC diagnostic ignored \"-Wconversion\"\n#endif\n    const int neg_one = (int) -1, const_zero = (int) 0;\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic pop\n#endif\n    const int is_unsigned = neg_one > const_zero;\n    if (is_unsigned) {\n        if (sizeof(int) < sizeof(long)) {\n            return PyInt_FromLong((long) value);\n        } else if (sizeof(int) <= sizeof(unsigned long)) {\n            return PyLong_FromUnsignedLong((unsigned long) value);\n#ifdef HAVE_LONG_LONG\n        } else if (sizeof(int) <= sizeof(unsigned PY_LONG_LONG)) {\n            return PyLong_FromUnsignedLongLong((unsigned PY_LONG_LONG) value);\n#endif\n        }\n    } else {\n        if (sizeof(int) <= sizeof(long)) {\n            return PyInt_FromLong((long) value);\n#ifdef HAVE_LONG_LONG\n        } else if (sizeof(int) <= sizeof(PY_LONG_LONG)) {\n            return PyLong_FromLongLong((PY_LONG_LONG) value);\n#endif\n        }\n    }\n    {\n        int one = 1; int little = (int)*(unsigned char *)&one;\n        unsigned char *bytes = (unsigned char *)&value;\n        return _PyLong_FromByteArray(bytes, sizeof(int),\n                                     little, !is_unsigned);\n    }\n}\n\n/* CIntFromPy */\n  static CYTHON_INLINE int __Pyx_PyInt_As_int(PyObject *x) {\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic push\n#pragma GCC diagnostic ignored \"-Wconversion\"\n#endif\n    const int neg_one = (int) -1, const_zero = (int) 0;\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic pop\n#endif\n    const int is_unsigned = neg_one > const_zero;\n#if PY_MAJOR_VERSION < 3\n    if (likely(PyInt_Check(x))) {\n        if (sizeof(int) < sizeof(long)) {\n            __PYX_VERIFY_RETURN_INT(int, long, PyInt_AS_LONG(x))\n        } else {\n            long val = PyInt_AS_LONG(x);\n            if (is_unsigned && unlikely(val < 0)) {\n             
   goto raise_neg_overflow;\n            }\n            return (int) val;\n        }\n    } else\n#endif\n    if (likely(PyLong_Check(x))) {\n        if (is_unsigned) {\n#if CYTHON_USE_PYLONG_INTERNALS\n            const digit* digits = ((PyLongObject*)x)->ob_digit;\n            switch (Py_SIZE(x)) {\n                case  0: return (int) 0;\n                case  1: __PYX_VERIFY_RETURN_INT(int, digit, digits[0])\n                case 2:\n                    if (8 * sizeof(int) > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) >= 2 * PyLong_SHIFT) {\n                            return (int) (((((int)digits[1]) << PyLong_SHIFT) | (int)digits[0]));\n                        }\n                    }\n                    break;\n                case 3:\n                    if (8 * sizeof(int) > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) >= 3 * PyLong_SHIFT) {\n                            return (int) (((((((int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0]));\n                        }\n                    }\n                    break;\n                case 4:\n                    if (8 * sizeof(int) > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << 
PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) >= 4 * PyLong_SHIFT) {\n                            return (int) (((((((((int)digits[3]) << PyLong_SHIFT) | (int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0]));\n                        }\n                    }\n                    break;\n            }\n#endif\n#if CYTHON_COMPILING_IN_CPYTHON\n            if (unlikely(Py_SIZE(x) < 0)) {\n                goto raise_neg_overflow;\n            }\n#else\n            {\n                int result = PyObject_RichCompareBool(x, Py_False, Py_LT);\n                if (unlikely(result < 0))\n                    return (int) -1;\n                if (unlikely(result == 1))\n                    goto raise_neg_overflow;\n            }\n#endif\n            if (sizeof(int) <= sizeof(unsigned long)) {\n                __PYX_VERIFY_RETURN_INT_EXC(int, unsigned long, PyLong_AsUnsignedLong(x))\n#ifdef HAVE_LONG_LONG\n            } else if (sizeof(int) <= sizeof(unsigned PY_LONG_LONG)) {\n                __PYX_VERIFY_RETURN_INT_EXC(int, unsigned PY_LONG_LONG, PyLong_AsUnsignedLongLong(x))\n#endif\n            }\n        } else {\n#if CYTHON_USE_PYLONG_INTERNALS\n            const digit* digits = ((PyLongObject*)x)->ob_digit;\n            switch (Py_SIZE(x)) {\n                case  0: return (int) 0;\n                case -1: __PYX_VERIFY_RETURN_INT(int, sdigit, (sdigit) (-(sdigit)digits[0]))\n                case  1: __PYX_VERIFY_RETURN_INT(int,  digit, +digits[0])\n                case -2:\n                    if (8 * sizeof(int) - 1 > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, long, -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) - 1 > 2 * PyLong_SHIFT) {\n                            return (int) 
(((int)-1)*(((((int)digits[1]) << PyLong_SHIFT) | (int)digits[0])));\n                        }\n                    }\n                    break;\n                case 2:\n                    if (8 * sizeof(int) > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) - 1 > 2 * PyLong_SHIFT) {\n                            return (int) ((((((int)digits[1]) << PyLong_SHIFT) | (int)digits[0])));\n                        }\n                    }\n                    break;\n                case -3:\n                    if (8 * sizeof(int) - 1 > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, long, -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) - 1 > 3 * PyLong_SHIFT) {\n                            return (int) (((int)-1)*(((((((int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0])));\n                        }\n                    }\n                    break;\n                case 3:\n                    if (8 * sizeof(int) > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) - 1 > 3 * PyLong_SHIFT) {\n                            return (int) ((((((((int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0])));\n                        }\n     
               }\n                    break;\n                case -4:\n                    if (8 * sizeof(int) - 1 > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, long, -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) - 1 > 4 * PyLong_SHIFT) {\n                            return (int) (((int)-1)*(((((((((int)digits[3]) << PyLong_SHIFT) | (int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0])));\n                        }\n                    }\n                    break;\n                case 4:\n                    if (8 * sizeof(int) > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(int) - 1 > 4 * PyLong_SHIFT) {\n                            return (int) ((((((((((int)digits[3]) << PyLong_SHIFT) | (int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0])));\n                        }\n                    }\n                    break;\n            }\n#endif\n            if (sizeof(int) <= sizeof(long)) {\n                __PYX_VERIFY_RETURN_INT_EXC(int, long, PyLong_AsLong(x))\n#ifdef HAVE_LONG_LONG\n            } else if (sizeof(int) <= sizeof(PY_LONG_LONG)) {\n                __PYX_VERIFY_RETURN_INT_EXC(int, PY_LONG_LONG, PyLong_AsLongLong(x))\n#endif\n            }\n        }\n        {\n#if CYTHON_COMPILING_IN_PYPY && !defined(_PyLong_AsByteArray)\n            
PyErr_SetString(PyExc_RuntimeError,\n                            \"_PyLong_AsByteArray() not available in PyPy, cannot convert large numbers\");\n#else\n            int val;\n            PyObject *v = __Pyx_PyNumber_IntOrLong(x);\n #if PY_MAJOR_VERSION < 3\n            if (likely(v) && !PyLong_Check(v)) {\n                PyObject *tmp = v;\n                v = PyNumber_Long(tmp);\n                Py_DECREF(tmp);\n            }\n #endif\n            if (likely(v)) {\n                int one = 1; int is_little = (int)*(unsigned char *)&one;\n                unsigned char *bytes = (unsigned char *)&val;\n                int ret = _PyLong_AsByteArray((PyLongObject *)v,\n                                              bytes, sizeof(val),\n                                              is_little, !is_unsigned);\n                Py_DECREF(v);\n                if (likely(!ret))\n                    return val;\n            }\n#endif\n            return (int) -1;\n        }\n    } else {\n        int val;\n        PyObject *tmp = __Pyx_PyNumber_IntOrLong(x);\n        if (!tmp) return (int) -1;\n        val = __Pyx_PyInt_As_int(tmp);\n        Py_DECREF(tmp);\n        return val;\n    }\nraise_overflow:\n    PyErr_SetString(PyExc_OverflowError,\n        \"value too large to convert to int\");\n    return (int) -1;\nraise_neg_overflow:\n    PyErr_SetString(PyExc_OverflowError,\n        \"can't convert negative value to int\");\n    return (int) -1;\n}\n\n/* CIntToPy */\n  static CYTHON_INLINE PyObject* __Pyx_PyInt_From_long(long value) {\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic push\n#pragma GCC diagnostic ignored \"-Wconversion\"\n#endif\n    const long neg_one = (long) -1, const_zero = (long) 0;\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic pop\n#endif\n    const int is_unsigned = neg_one > const_zero;\n    if (is_unsigned) {\n        if (sizeof(long) < sizeof(long)) {\n            return PyInt_FromLong((long) value);\n        } else if (sizeof(long) 
<= sizeof(unsigned long)) {\n            return PyLong_FromUnsignedLong((unsigned long) value);\n#ifdef HAVE_LONG_LONG\n        } else if (sizeof(long) <= sizeof(unsigned PY_LONG_LONG)) {\n            return PyLong_FromUnsignedLongLong((unsigned PY_LONG_LONG) value);\n#endif\n        }\n    } else {\n        if (sizeof(long) <= sizeof(long)) {\n            return PyInt_FromLong((long) value);\n#ifdef HAVE_LONG_LONG\n        } else if (sizeof(long) <= sizeof(PY_LONG_LONG)) {\n            return PyLong_FromLongLong((PY_LONG_LONG) value);\n#endif\n        }\n    }\n    {\n        int one = 1; int little = (int)*(unsigned char *)&one;\n        unsigned char *bytes = (unsigned char *)&value;\n        return _PyLong_FromByteArray(bytes, sizeof(long),\n                                     little, !is_unsigned);\n    }\n}\n\n/* CIntFromPy */\n  static CYTHON_INLINE long __Pyx_PyInt_As_long(PyObject *x) {\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic push\n#pragma GCC diagnostic ignored \"-Wconversion\"\n#endif\n    const long neg_one = (long) -1, const_zero = (long) 0;\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic pop\n#endif\n    const int is_unsigned = neg_one > const_zero;\n#if PY_MAJOR_VERSION < 3\n    if (likely(PyInt_Check(x))) {\n        if (sizeof(long) < sizeof(long)) {\n            __PYX_VERIFY_RETURN_INT(long, long, PyInt_AS_LONG(x))\n        } else {\n            long val = PyInt_AS_LONG(x);\n            if (is_unsigned && unlikely(val < 0)) {\n                goto raise_neg_overflow;\n            }\n            return (long) val;\n        }\n    } else\n#endif\n    if (likely(PyLong_Check(x))) {\n        if (is_unsigned) {\n#if CYTHON_USE_PYLONG_INTERNALS\n            const digit* digits = ((PyLongObject*)x)->ob_digit;\n            switch (Py_SIZE(x)) {\n                case  0: return (long) 0;\n                case  1: __PYX_VERIFY_RETURN_INT(long, digit, digits[0])\n                case 2:\n                    if (8 * sizeof(long) > 1 
* PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) >= 2 * PyLong_SHIFT) {\n                            return (long) (((((long)digits[1]) << PyLong_SHIFT) | (long)digits[0]));\n                        }\n                    }\n                    break;\n                case 3:\n                    if (8 * sizeof(long) > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) >= 3 * PyLong_SHIFT) {\n                            return (long) (((((((long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0]));\n                        }\n                    }\n                    break;\n                case 4:\n                    if (8 * sizeof(long) > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) >= 4 * PyLong_SHIFT) {\n                            return (long) (((((((((long)digits[3]) << PyLong_SHIFT) | (long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0]));\n                        }\n                    }\n                    break;\n            }\n#endif\n#if CYTHON_COMPILING_IN_CPYTHON\n            if (unlikely(Py_SIZE(x) < 
0)) {\n                goto raise_neg_overflow;\n            }\n#else\n            {\n                int result = PyObject_RichCompareBool(x, Py_False, Py_LT);\n                if (unlikely(result < 0))\n                    return (long) -1;\n                if (unlikely(result == 1))\n                    goto raise_neg_overflow;\n            }\n#endif\n            if (sizeof(long) <= sizeof(unsigned long)) {\n                __PYX_VERIFY_RETURN_INT_EXC(long, unsigned long, PyLong_AsUnsignedLong(x))\n#ifdef HAVE_LONG_LONG\n            } else if (sizeof(long) <= sizeof(unsigned PY_LONG_LONG)) {\n                __PYX_VERIFY_RETURN_INT_EXC(long, unsigned PY_LONG_LONG, PyLong_AsUnsignedLongLong(x))\n#endif\n            }\n        } else {\n#if CYTHON_USE_PYLONG_INTERNALS\n            const digit* digits = ((PyLongObject*)x)->ob_digit;\n            switch (Py_SIZE(x)) {\n                case  0: return (long) 0;\n                case -1: __PYX_VERIFY_RETURN_INT(long, sdigit, (sdigit) (-(sdigit)digits[0]))\n                case  1: __PYX_VERIFY_RETURN_INT(long,  digit, +digits[0])\n                case -2:\n                    if (8 * sizeof(long) - 1 > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, long, -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) {\n                            return (long) (((long)-1)*(((((long)digits[1]) << PyLong_SHIFT) | (long)digits[0])));\n                        }\n                    }\n                    break;\n                case 2:\n                    if (8 * sizeof(long) > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned 
long)digits[0])))\n                        } else if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) {\n                            return (long) ((((((long)digits[1]) << PyLong_SHIFT) | (long)digits[0])));\n                        }\n                    }\n                    break;\n                case -3:\n                    if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, long, -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) {\n                            return (long) (((long)-1)*(((((((long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0])));\n                        }\n                    }\n                    break;\n                case 3:\n                    if (8 * sizeof(long) > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) {\n                            return (long) ((((((((long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0])));\n                        }\n                    }\n                    break;\n                case -4:\n                    if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, long, -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << 
PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) - 1 > 4 * PyLong_SHIFT) {\n                            return (long) (((long)-1)*(((((((((long)digits[3]) << PyLong_SHIFT) | (long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0])));\n                        }\n                    }\n                    break;\n                case 4:\n                    if (8 * sizeof(long) > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(long) - 1 > 4 * PyLong_SHIFT) {\n                            return (long) ((((((((((long)digits[3]) << PyLong_SHIFT) | (long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0])));\n                        }\n                    }\n                    break;\n            }\n#endif\n            if (sizeof(long) <= sizeof(long)) {\n                __PYX_VERIFY_RETURN_INT_EXC(long, long, PyLong_AsLong(x))\n#ifdef HAVE_LONG_LONG\n            } else if (sizeof(long) <= sizeof(PY_LONG_LONG)) {\n                __PYX_VERIFY_RETURN_INT_EXC(long, PY_LONG_LONG, PyLong_AsLongLong(x))\n#endif\n            }\n        }\n        {\n#if CYTHON_COMPILING_IN_PYPY && !defined(_PyLong_AsByteArray)\n            PyErr_SetString(PyExc_RuntimeError,\n                            \"_PyLong_AsByteArray() not available in PyPy, cannot convert large numbers\");\n#else\n            long val;\n            PyObject *v = __Pyx_PyNumber_IntOrLong(x);\n #if PY_MAJOR_VERSION < 3\n            if (likely(v) && !PyLong_Check(v)) {\n                PyObject *tmp = v;\n                v = PyNumber_Long(tmp);\n                
Py_DECREF(tmp);\n            }\n #endif\n            if (likely(v)) {\n                int one = 1; int is_little = (int)*(unsigned char *)&one;\n                unsigned char *bytes = (unsigned char *)&val;\n                int ret = _PyLong_AsByteArray((PyLongObject *)v,\n                                              bytes, sizeof(val),\n                                              is_little, !is_unsigned);\n                Py_DECREF(v);\n                if (likely(!ret))\n                    return val;\n            }\n#endif\n            return (long) -1;\n        }\n    } else {\n        long val;\n        PyObject *tmp = __Pyx_PyNumber_IntOrLong(x);\n        if (!tmp) return (long) -1;\n        val = __Pyx_PyInt_As_long(tmp);\n        Py_DECREF(tmp);\n        return val;\n    }\nraise_overflow:\n    PyErr_SetString(PyExc_OverflowError,\n        \"value too large to convert to long\");\n    return (long) -1;\nraise_neg_overflow:\n    PyErr_SetString(PyExc_OverflowError,\n        \"can't convert negative value to long\");\n    return (long) -1;\n}\n\n/* CIntFromPy */\n  static CYTHON_INLINE char __Pyx_PyInt_As_char(PyObject *x) {\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic push\n#pragma GCC diagnostic ignored \"-Wconversion\"\n#endif\n    const char neg_one = (char) -1, const_zero = (char) 0;\n#ifdef __Pyx_HAS_GCC_DIAGNOSTIC\n#pragma GCC diagnostic pop\n#endif\n    const int is_unsigned = neg_one > const_zero;\n#if PY_MAJOR_VERSION < 3\n    if (likely(PyInt_Check(x))) {\n        if (sizeof(char) < sizeof(long)) {\n            __PYX_VERIFY_RETURN_INT(char, long, PyInt_AS_LONG(x))\n        } else {\n            long val = PyInt_AS_LONG(x);\n            if (is_unsigned && unlikely(val < 0)) {\n                goto raise_neg_overflow;\n            }\n            return (char) val;\n        }\n    } else\n#endif\n    if (likely(PyLong_Check(x))) {\n        if (is_unsigned) {\n#if CYTHON_USE_PYLONG_INTERNALS\n            const digit* digits = 
((PyLongObject*)x)->ob_digit;\n            switch (Py_SIZE(x)) {\n                case  0: return (char) 0;\n                case  1: __PYX_VERIFY_RETURN_INT(char, digit, digits[0])\n                case 2:\n                    if (8 * sizeof(char) > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) >= 2 * PyLong_SHIFT) {\n                            return (char) (((((char)digits[1]) << PyLong_SHIFT) | (char)digits[0]));\n                        }\n                    }\n                    break;\n                case 3:\n                    if (8 * sizeof(char) > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) >= 3 * PyLong_SHIFT) {\n                            return (char) (((((((char)digits[2]) << PyLong_SHIFT) | (char)digits[1]) << PyLong_SHIFT) | (char)digits[0]));\n                        }\n                    }\n                    break;\n                case 4:\n                    if (8 * sizeof(char) > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) >= 4 * PyLong_SHIFT) {\n                            return (char) (((((((((char)digits[3]) << PyLong_SHIFT) | 
(char)digits[2]) << PyLong_SHIFT) | (char)digits[1]) << PyLong_SHIFT) | (char)digits[0]));\n                        }\n                    }\n                    break;\n            }\n#endif\n#if CYTHON_COMPILING_IN_CPYTHON\n            if (unlikely(Py_SIZE(x) < 0)) {\n                goto raise_neg_overflow;\n            }\n#else\n            {\n                int result = PyObject_RichCompareBool(x, Py_False, Py_LT);\n                if (unlikely(result < 0))\n                    return (char) -1;\n                if (unlikely(result == 1))\n                    goto raise_neg_overflow;\n            }\n#endif\n            if (sizeof(char) <= sizeof(unsigned long)) {\n                __PYX_VERIFY_RETURN_INT_EXC(char, unsigned long, PyLong_AsUnsignedLong(x))\n#ifdef HAVE_LONG_LONG\n            } else if (sizeof(char) <= sizeof(unsigned PY_LONG_LONG)) {\n                __PYX_VERIFY_RETURN_INT_EXC(char, unsigned PY_LONG_LONG, PyLong_AsUnsignedLongLong(x))\n#endif\n            }\n        } else {\n#if CYTHON_USE_PYLONG_INTERNALS\n            const digit* digits = ((PyLongObject*)x)->ob_digit;\n            switch (Py_SIZE(x)) {\n                case  0: return (char) 0;\n                case -1: __PYX_VERIFY_RETURN_INT(char, sdigit, (sdigit) (-(sdigit)digits[0]))\n                case  1: __PYX_VERIFY_RETURN_INT(char,  digit, +digits[0])\n                case -2:\n                    if (8 * sizeof(char) - 1 > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, long, -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) - 1 > 2 * PyLong_SHIFT) {\n                            return (char) (((char)-1)*(((((char)digits[1]) << PyLong_SHIFT) | (char)digits[0])));\n                        }\n                    }\n                    break;\n                case 2:\n                    if 
(8 * sizeof(char) > 1 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) - 1 > 2 * PyLong_SHIFT) {\n                            return (char) ((((((char)digits[1]) << PyLong_SHIFT) | (char)digits[0])));\n                        }\n                    }\n                    break;\n                case -3:\n                    if (8 * sizeof(char) - 1 > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, long, -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) - 1 > 3 * PyLong_SHIFT) {\n                            return (char) (((char)-1)*(((((((char)digits[2]) << PyLong_SHIFT) | (char)digits[1]) << PyLong_SHIFT) | (char)digits[0])));\n                        }\n                    }\n                    break;\n                case 3:\n                    if (8 * sizeof(char) > 2 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) - 1 > 3 * PyLong_SHIFT) {\n                            return (char) ((((((((char)digits[2]) << PyLong_SHIFT) | (char)digits[1]) << PyLong_SHIFT) | (char)digits[0])));\n                        }\n                    }\n                    break;\n                case -4:\n                    if (8 * sizeof(char) - 1 > 3 * PyLong_SHIFT) {\n                        if (8 * 
sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, long, -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) - 1 > 4 * PyLong_SHIFT) {\n                            return (char) (((char)-1)*(((((((((char)digits[3]) << PyLong_SHIFT) | (char)digits[2]) << PyLong_SHIFT) | (char)digits[1]) << PyLong_SHIFT) | (char)digits[0])));\n                        }\n                    }\n                    break;\n                case 4:\n                    if (8 * sizeof(char) > 3 * PyLong_SHIFT) {\n                        if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) {\n                            __PYX_VERIFY_RETURN_INT(char, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0])))\n                        } else if (8 * sizeof(char) - 1 > 4 * PyLong_SHIFT) {\n                            return (char) ((((((((((char)digits[3]) << PyLong_SHIFT) | (char)digits[2]) << PyLong_SHIFT) | (char)digits[1]) << PyLong_SHIFT) | (char)digits[0])));\n                        }\n                    }\n                    break;\n            }\n#endif\n            if (sizeof(char) <= sizeof(long)) {\n                __PYX_VERIFY_RETURN_INT_EXC(char, long, PyLong_AsLong(x))\n#ifdef HAVE_LONG_LONG\n            } else if (sizeof(char) <= sizeof(PY_LONG_LONG)) {\n                __PYX_VERIFY_RETURN_INT_EXC(char, PY_LONG_LONG, PyLong_AsLongLong(x))\n#endif\n            }\n        }\n        {\n#if CYTHON_COMPILING_IN_PYPY && !defined(_PyLong_AsByteArray)\n            PyErr_SetString(PyExc_RuntimeError,\n                            \"_PyLong_AsByteArray() not available in PyPy, cannot convert large numbers\");\n#else\n            char 
val;\n            PyObject *v = __Pyx_PyNumber_IntOrLong(x);\n #if PY_MAJOR_VERSION < 3\n            if (likely(v) && !PyLong_Check(v)) {\n                PyObject *tmp = v;\n                v = PyNumber_Long(tmp);\n                Py_DECREF(tmp);\n            }\n #endif\n            if (likely(v)) {\n                int one = 1; int is_little = (int)*(unsigned char *)&one;\n                unsigned char *bytes = (unsigned char *)&val;\n                int ret = _PyLong_AsByteArray((PyLongObject *)v,\n                                              bytes, sizeof(val),\n                                              is_little, !is_unsigned);\n                Py_DECREF(v);\n                if (likely(!ret))\n                    return val;\n            }\n#endif\n            return (char) -1;\n        }\n    } else {\n        char val;\n        PyObject *tmp = __Pyx_PyNumber_IntOrLong(x);\n        if (!tmp) return (char) -1;\n        val = __Pyx_PyInt_As_char(tmp);\n        Py_DECREF(tmp);\n        return val;\n    }\nraise_overflow:\n    PyErr_SetString(PyExc_OverflowError,\n        \"value too large to convert to char\");\n    return (char) -1;\nraise_neg_overflow:\n    PyErr_SetString(PyExc_OverflowError,\n        \"can't convert negative value to char\");\n    return (char) -1;\n}\n\n/* CheckBinaryVersion */\n  static int __Pyx_check_binary_version(void) {\n    char ctversion[4], rtversion[4];\n    PyOS_snprintf(ctversion, 4, \"%d.%d\", PY_MAJOR_VERSION, PY_MINOR_VERSION);\n    PyOS_snprintf(rtversion, 4, \"%s\", Py_GetVersion());\n    if (ctversion[0] != rtversion[0] || ctversion[2] != rtversion[2]) {\n        char message[200];\n        PyOS_snprintf(message, sizeof(message),\n                      \"compiletime version %s of module '%.100s' \"\n                      \"does not match runtime version %s\",\n                      ctversion, __Pyx_MODULE_NAME, rtversion);\n        return PyErr_WarnEx(NULL, message, 1);\n    }\n    return 0;\n}\n\n/* InitStrings */\n  
static int __Pyx_InitStrings(__Pyx_StringTabEntry *t) {\n    while (t->p) {\n        #if PY_MAJOR_VERSION < 3\n        if (t->is_unicode) {\n            *t->p = PyUnicode_DecodeUTF8(t->s, t->n - 1, NULL);\n        } else if (t->intern) {\n            *t->p = PyString_InternFromString(t->s);\n        } else {\n            *t->p = PyString_FromStringAndSize(t->s, t->n - 1);\n        }\n        #else\n        if (t->is_unicode | t->is_str) {\n            if (t->intern) {\n                *t->p = PyUnicode_InternFromString(t->s);\n            } else if (t->encoding) {\n                *t->p = PyUnicode_Decode(t->s, t->n - 1, t->encoding, NULL);\n            } else {\n                *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1);\n            }\n        } else {\n            *t->p = PyBytes_FromStringAndSize(t->s, t->n - 1);\n        }\n        #endif\n        if (!*t->p)\n            return -1;\n        if (PyObject_Hash(*t->p) == -1)\n            return -1;\n        ++t;\n    }\n    return 0;\n}\n\nstatic CYTHON_INLINE PyObject* __Pyx_PyUnicode_FromString(const char* c_str) {\n    return __Pyx_PyUnicode_FromStringAndSize(c_str, (Py_ssize_t)strlen(c_str));\n}\nstatic CYTHON_INLINE const char* __Pyx_PyObject_AsString(PyObject* o) {\n    Py_ssize_t ignore;\n    return __Pyx_PyObject_AsStringAndSize(o, &ignore);\n}\n#if __PYX_DEFAULT_STRING_ENCODING_IS_ASCII || __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT\n#if !CYTHON_PEP393_ENABLED\nstatic const char* __Pyx_PyUnicode_AsStringAndSize(PyObject* o, Py_ssize_t *length) {\n    char* defenc_c;\n    PyObject* defenc = _PyUnicode_AsDefaultEncodedString(o, NULL);\n    if (!defenc) return NULL;\n    defenc_c = PyBytes_AS_STRING(defenc);\n#if __PYX_DEFAULT_STRING_ENCODING_IS_ASCII\n    {\n        char* end = defenc_c + PyBytes_GET_SIZE(defenc);\n        char* c;\n        for (c = defenc_c; c < end; c++) {\n            if ((unsigned char) (*c) >= 128) {\n                PyUnicode_AsASCIIString(o);\n                return NULL;\n   
         }\n        }\n    }\n#endif\n    *length = PyBytes_GET_SIZE(defenc);\n    return defenc_c;\n}\n#else\nstatic CYTHON_INLINE const char* __Pyx_PyUnicode_AsStringAndSize(PyObject* o, Py_ssize_t *length) {\n    if (unlikely(__Pyx_PyUnicode_READY(o) == -1)) return NULL;\n#if __PYX_DEFAULT_STRING_ENCODING_IS_ASCII\n    if (likely(PyUnicode_IS_ASCII(o))) {\n        *length = PyUnicode_GET_LENGTH(o);\n        return PyUnicode_AsUTF8(o);\n    } else {\n        PyUnicode_AsASCIIString(o);\n        return NULL;\n    }\n#else\n    return PyUnicode_AsUTF8AndSize(o, length);\n#endif\n}\n#endif\n#endif\nstatic CYTHON_INLINE const char* __Pyx_PyObject_AsStringAndSize(PyObject* o, Py_ssize_t *length) {\n#if __PYX_DEFAULT_STRING_ENCODING_IS_ASCII || __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT\n    if (\n#if PY_MAJOR_VERSION < 3 && __PYX_DEFAULT_STRING_ENCODING_IS_ASCII\n            __Pyx_sys_getdefaultencoding_not_ascii &&\n#endif\n            PyUnicode_Check(o)) {\n        return __Pyx_PyUnicode_AsStringAndSize(o, length);\n    } else\n#endif\n#if (!CYTHON_COMPILING_IN_PYPY) || (defined(PyByteArray_AS_STRING) && defined(PyByteArray_GET_SIZE))\n    if (PyByteArray_Check(o)) {\n        *length = PyByteArray_GET_SIZE(o);\n        return PyByteArray_AS_STRING(o);\n    } else\n#endif\n    {\n        char* result;\n        int r = PyBytes_AsStringAndSize(o, &result, length);\n        if (unlikely(r < 0)) {\n            return NULL;\n        } else {\n            return result;\n        }\n    }\n}\nstatic CYTHON_INLINE int __Pyx_PyObject_IsTrue(PyObject* x) {\n   int is_true = x == Py_True;\n   if (is_true | (x == Py_False) | (x == Py_None)) return is_true;\n   else return PyObject_IsTrue(x);\n}\nstatic CYTHON_INLINE int __Pyx_PyObject_IsTrueAndDecref(PyObject* x) {\n    int retval;\n    if (unlikely(!x)) return -1;\n    retval = __Pyx_PyObject_IsTrue(x);\n    Py_DECREF(x);\n    return retval;\n}\nstatic PyObject* __Pyx_PyNumber_IntOrLongWrongResultType(PyObject* result, const 
char* type_name) {\n#if PY_MAJOR_VERSION >= 3\n    if (PyLong_Check(result)) {\n        if (PyErr_WarnFormat(PyExc_DeprecationWarning, 1,\n                \"__int__ returned non-int (type %.200s).  \"\n                \"The ability to return an instance of a strict subclass of int \"\n                \"is deprecated, and may be removed in a future version of Python.\",\n                Py_TYPE(result)->tp_name)) {\n            Py_DECREF(result);\n            return NULL;\n        }\n        return result;\n    }\n#endif\n    PyErr_Format(PyExc_TypeError,\n                 \"__%.4s__ returned non-%.4s (type %.200s)\",\n                 type_name, type_name, Py_TYPE(result)->tp_name);\n    Py_DECREF(result);\n    return NULL;\n}\nstatic CYTHON_INLINE PyObject* __Pyx_PyNumber_IntOrLong(PyObject* x) {\n#if CYTHON_USE_TYPE_SLOTS\n  PyNumberMethods *m;\n#endif\n  const char *name = NULL;\n  PyObject *res = NULL;\n#if PY_MAJOR_VERSION < 3\n  if (likely(PyInt_Check(x) || PyLong_Check(x)))\n#else\n  if (likely(PyLong_Check(x)))\n#endif\n    return __Pyx_NewRef(x);\n#if CYTHON_USE_TYPE_SLOTS\n  m = Py_TYPE(x)->tp_as_number;\n  #if PY_MAJOR_VERSION < 3\n  if (m && m->nb_int) {\n    name = \"int\";\n    res = m->nb_int(x);\n  }\n  else if (m && m->nb_long) {\n    name = \"long\";\n    res = m->nb_long(x);\n  }\n  #else\n  if (likely(m && m->nb_int)) {\n    name = \"int\";\n    res = m->nb_int(x);\n  }\n  #endif\n#else\n  if (!PyBytes_CheckExact(x) && !PyUnicode_CheckExact(x)) {\n    res = PyNumber_Int(x);\n  }\n#endif\n  if (likely(res)) {\n#if PY_MAJOR_VERSION < 3\n    if (unlikely(!PyInt_Check(res) && !PyLong_Check(res))) {\n#else\n    if (unlikely(!PyLong_CheckExact(res))) {\n#endif\n        return __Pyx_PyNumber_IntOrLongWrongResultType(res, name);\n    }\n  }\n  else if (!PyErr_Occurred()) {\n    PyErr_SetString(PyExc_TypeError,\n                    \"an integer is required\");\n  }\n  return res;\n}\nstatic CYTHON_INLINE Py_ssize_t __Pyx_PyIndex_AsSsize_t(PyObject* b) 
{\n  Py_ssize_t ival;\n  PyObject *x;\n#if PY_MAJOR_VERSION < 3\n  if (likely(PyInt_CheckExact(b))) {\n    if (sizeof(Py_ssize_t) >= sizeof(long))\n        return PyInt_AS_LONG(b);\n    else\n        return PyInt_AsSsize_t(b);\n  }\n#endif\n  if (likely(PyLong_CheckExact(b))) {\n    #if CYTHON_USE_PYLONG_INTERNALS\n    const digit* digits = ((PyLongObject*)b)->ob_digit;\n    const Py_ssize_t size = Py_SIZE(b);\n    if (likely(__Pyx_sst_abs(size) <= 1)) {\n        ival = likely(size) ? digits[0] : 0;\n        if (size == -1) ival = -ival;\n        return ival;\n    } else {\n      switch (size) {\n         case 2:\n           if (8 * sizeof(Py_ssize_t) > 2 * PyLong_SHIFT) {\n             return (Py_ssize_t) (((((size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]));\n           }\n           break;\n         case -2:\n           if (8 * sizeof(Py_ssize_t) > 2 * PyLong_SHIFT) {\n             return -(Py_ssize_t) (((((size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]));\n           }\n           break;\n         case 3:\n           if (8 * sizeof(Py_ssize_t) > 3 * PyLong_SHIFT) {\n             return (Py_ssize_t) (((((((size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]));\n           }\n           break;\n         case -3:\n           if (8 * sizeof(Py_ssize_t) > 3 * PyLong_SHIFT) {\n             return -(Py_ssize_t) (((((((size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]));\n           }\n           break;\n         case 4:\n           if (8 * sizeof(Py_ssize_t) > 4 * PyLong_SHIFT) {\n             return (Py_ssize_t) (((((((((size_t)digits[3]) << PyLong_SHIFT) | (size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]));\n           }\n           break;\n         case -4:\n           if (8 * sizeof(Py_ssize_t) > 4 * PyLong_SHIFT) {\n             return -(Py_ssize_t) (((((((((size_t)digits[3]) << PyLong_SHIFT) | (size_t)digits[2]) << 
PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]));\n           }\n           break;\n      }\n    }\n    #endif\n    return PyLong_AsSsize_t(b);\n  }\n  x = PyNumber_Index(b);\n  if (!x) return -1;\n  ival = PyInt_AsSsize_t(x);\n  Py_DECREF(x);\n  return ival;\n}\nstatic CYTHON_INLINE Py_hash_t __Pyx_PyIndex_AsHash_t(PyObject* o) {\n  if (sizeof(Py_hash_t) == sizeof(Py_ssize_t)) {\n    return (Py_hash_t) __Pyx_PyIndex_AsSsize_t(o);\n#if PY_MAJOR_VERSION < 3\n  } else if (likely(PyInt_CheckExact(o))) {\n    return PyInt_AS_LONG(o);\n#endif\n  } else {\n    Py_ssize_t ival;\n    PyObject *x;\n    x = PyNumber_Index(o);\n    if (!x) return -1;\n    ival = PyInt_AsLong(x);\n    Py_DECREF(x);\n    return ival;\n  }\n}\nstatic CYTHON_INLINE PyObject * __Pyx_PyBool_FromLong(long b) {\n  return b ? __Pyx_NewRef(Py_True) : __Pyx_NewRef(Py_False);\n}\nstatic CYTHON_INLINE PyObject * __Pyx_PyInt_FromSize_t(size_t ival) {\n    return PyInt_FromSize_t(ival);\n}\n\n\n#endif /* Py_PYTHON_H */\n"
  },
  {
    "path": "TTS/tts/utils/monotonic_align/core.pyx",
    "content": "import numpy as np\n\ncimport cython\ncimport numpy as np\n\nfrom cython.parallel import prange\n\n\n@cython.boundscheck(False)\n@cython.wraparound(False)\ncdef void maximum_path_each(int[:,::1] path, float[:,::1] value, int t_x, int t_y, float max_neg_val) nogil:\n  cdef int x\n  cdef int y\n  cdef float v_prev\n  cdef float v_cur\n  cdef float tmp\n  cdef int index = t_x - 1\n\n  for y in range(t_y):\n    for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):\n      if x == y:\n        v_cur = max_neg_val\n      else:\n        v_cur = value[x, y-1]\n      if x == 0:\n        if y == 0:\n          v_prev = 0.\n        else:\n          v_prev = max_neg_val\n      else:\n        v_prev = value[x-1, y-1]\n      value[x, y] = max(v_cur, v_prev) + value[x, y]\n\n  for y in range(t_y - 1, -1, -1):\n    path[index, y] = 1\n    if index != 0 and (index == y or value[index, y-1] < value[index-1, y-1]):\n      index = index - 1\n\n\n@cython.boundscheck(False)\n@cython.wraparound(False)\ncpdef void maximum_path_c(int[:,:,::1] paths, float[:,:,::1] values, int[::1] t_xs, int[::1] t_ys, float max_neg_val=-1e9) nogil:\n  cdef int b = values.shape[0]\n\n  cdef int i\n  for i in prange(b, nogil=True):\n    maximum_path_each(paths[i], values[i], t_xs[i], t_ys[i], max_neg_val)\n"
  },
  {
    "path": "TTS/tts/utils/monotonic_align/setup.py",
    "content": "# from distutils.core import setup\n# from Cython.Build import cythonize\n# import numpy\n\n# setup(name='monotonic_align',\n#       ext_modules=cythonize(\"core.pyx\"),\n#       include_dirs=[numpy.get_include()])\n"
  },
  {
    "path": "TTS/tts/utils/speakers.py",
    "content": "import json\nimport os\nfrom typing import Any, Dict, List, Union\n\nimport fsspec\nimport numpy as np\nimport torch\nfrom coqpit import Coqpit\n\nfrom TTS.config import get_from_config_or_model_args_with_default\nfrom TTS.tts.utils.managers import EmbeddingManager\n\n\nclass SpeakerManager(EmbeddingManager):\n    \"\"\"Manage the speakers for multi-speaker 🐸TTS models. Load a datafile and parse the information\n    in a way that can be queried by speaker or clip.\n\n    There are 3 different scenarios considered:\n\n    1. Models using speaker embedding layers. The datafile only maps speaker names to ids used by the embedding layer.\n    2. Models using d-vectors. The datafile includes a dictionary in the following format.\n\n    ::\n\n        {\n            'clip_name.wav':{\n                'name': 'speakerA',\n                'embedding'[<d_vector_values>]\n            },\n            ...\n        }\n\n\n    3. Computing the d-vectors by the speaker encoder. It loads the speaker encoder model and\n    computes the d-vectors for a given clip or speaker.\n\n    Args:\n        d_vectors_file_path (str, optional): Path to the metafile including x vectors. Defaults to \"\".\n        speaker_id_file_path (str, optional): Path to the metafile that maps speaker names to ids used by\n        TTS models. Defaults to \"\".\n        encoder_model_path (str, optional): Path to the speaker encoder model file. Defaults to \"\".\n        encoder_config_path (str, optional): Path to the spealer encoder config file. 
Defaults to \"\".\n\n    Examples:\n        >>> # load audio processor and speaker encoder\n        >>> ap = AudioProcessor(**config.audio)\n        >>> manager = SpeakerManager(encoder_model_path=encoder_model_path, encoder_config_path=encoder_config_path)\n        >>> # load a sample audio and compute embedding\n        >>> waveform = ap.load_wav(sample_wav_path)\n        >>> mel = ap.melspectrogram(waveform)\n        >>> d_vector = manager.compute_embeddings(mel.T)\n    \"\"\"\n\n    def __init__(\n        self,\n        data_items: List[List[Any]] = None,\n        d_vectors_file_path: str = \"\",\n        speaker_id_file_path: str = \"\",\n        encoder_model_path: str = \"\",\n        encoder_config_path: str = \"\",\n        use_cuda: bool = False,\n    ):\n        super().__init__(\n            embedding_file_path=d_vectors_file_path,\n            id_file_path=speaker_id_file_path,\n            encoder_model_path=encoder_model_path,\n            encoder_config_path=encoder_config_path,\n            use_cuda=use_cuda,\n        )\n\n        if data_items:\n            self.set_ids_from_data(data_items, parse_key=\"speaker_name\")\n\n    @property\n    def num_speakers(self):\n        return len(self.name_to_id)\n\n    @property\n    def speaker_names(self):\n        return list(self.name_to_id.keys())\n\n    def get_speakers(self) -> Dict:\n        return self.name_to_id\n\n    @staticmethod\n    def init_from_config(config: \"Coqpit\", samples: Union[List[List], List[Dict]] = None) -> \"SpeakerManager\":\n        \"\"\"Initialize a speaker manager from config\n\n        Args:\n            config (Coqpit): Config object.\n            samples (Union[List[List], List[Dict]], optional): List of data samples to parse out the speaker names.\n                Defaults to None.\n\n        Returns:\n            SpeakerManager: Speaker manager object.\n        \"\"\"\n        speaker_manager = None\n        if get_from_config_or_model_args_with_default(config, 
\"use_speaker_embedding\", False):\n            if samples:\n                speaker_manager = SpeakerManager(data_items=samples)\n            if get_from_config_or_model_args_with_default(config, \"speaker_file\", None):\n                speaker_manager = SpeakerManager(\n                    speaker_id_file_path=get_from_config_or_model_args_with_default(config, \"speaker_file\", None)\n                )\n            if get_from_config_or_model_args_with_default(config, \"speakers_file\", None):\n                speaker_manager = SpeakerManager(\n                    speaker_id_file_path=get_from_config_or_model_args_with_default(config, \"speakers_file\", None)\n                )\n\n        if get_from_config_or_model_args_with_default(config, \"use_d_vector_file\", False):\n            speaker_manager = SpeakerManager()\n            if get_from_config_or_model_args_with_default(config, \"d_vector_file\", None):\n                speaker_manager = SpeakerManager(\n                    d_vectors_file_path=get_from_config_or_model_args_with_default(config, \"d_vector_file\", None)\n                )\n        return speaker_manager\n\n\ndef _set_file_path(path):\n    \"\"\"Find the speakers.json under the given path or the above it.\n    Intended to band aid the different paths returned in restored and continued training.\"\"\"\n    path_restore = os.path.join(os.path.dirname(path), \"speakers.json\")\n    path_continue = os.path.join(path, \"speakers.json\")\n    fs = fsspec.get_mapper(path).fs\n    if fs.exists(path_restore):\n        return path_restore\n    if fs.exists(path_continue):\n        return path_continue\n    raise FileNotFoundError(f\" [!] 
`speakers.json` not found in {path}\")\n\n\ndef load_speaker_mapping(out_path):\n    \"\"\"Loads speaker mapping if already present.\"\"\"\n    if os.path.splitext(out_path)[1] == \".json\":\n        json_file = out_path\n    else:\n        json_file = _set_file_path(out_path)\n    with fsspec.open(json_file, \"r\") as f:\n        return json.load(f)\n\n\ndef save_speaker_mapping(out_path, speaker_mapping):\n    \"\"\"Saves speaker mapping if not yet present.\"\"\"\n    if out_path is not None:\n        speakers_json_path = _set_file_path(out_path)\n        with fsspec.open(speakers_json_path, \"w\") as f:\n            json.dump(speaker_mapping, f, indent=4)\n\n\ndef get_speaker_manager(c: Coqpit, data: List = None, restore_path: str = None, out_path: str = None) -> SpeakerManager:\n    \"\"\"Initialize a `SpeakerManager` instance from the provided config.\n\n    Args:\n        c (Coqpit): Model configuration.\n        restore_path (str): Path to a previous training folder.\n        data (List): Data samples used in training to infer speakers from. It must be provided if speaker embedding\n            layers are used. Defaults to None.\n        out_path (str, optional): Path to save the generated speaker IDs to. 
Defaults to None.\n\n    Returns:\n        SpeakerManager: initialized and ready to use instance.\n    \"\"\"\n    speaker_manager = SpeakerManager()\n    if c.use_speaker_embedding:\n        if data is not None:\n            speaker_manager.set_ids_from_data(data, parse_key=\"speaker_name\")\n        if restore_path:\n            speakers_file = _set_file_path(restore_path)\n            # restoring speaker manager from a previous run.\n            if c.use_d_vector_file:\n                # restore speaker manager with the embedding file\n                if not os.path.exists(speakers_file):\n                    print(\"WARNING: speakers.json was not found in restore_path, trying to use CONFIG.d_vector_file\")\n                    if not os.path.exists(c.d_vector_file):\n                        raise RuntimeError(\n                            \"You must copy the file speakers.json to restore_path, or set a valid file in CONFIG.d_vector_file\"\n                        )\n                    speaker_manager.load_embeddings_from_file(c.d_vector_file)\n                speaker_manager.load_embeddings_from_file(speakers_file)\n            elif not c.use_d_vector_file:  # restore speaker manager with speaker ID file.\n                speaker_ids_from_data = speaker_manager.name_to_id\n                speaker_manager.load_ids_from_file(speakers_file)\n                assert all(\n                    speaker in speaker_manager.name_to_id for speaker in speaker_ids_from_data\n                ), \" [!] 
You cannot introduce new speakers to a pre-trained model.\"\n        elif c.use_d_vector_file and c.d_vector_file:\n            # new speaker manager with external speaker embeddings.\n            speaker_manager.load_embeddings_from_file(c.d_vector_file)\n        elif c.use_d_vector_file and not c.d_vector_file:\n            raise RuntimeError(\"use_d_vector_file is True, so you need to pass an external speaker embedding file.\")\n        elif c.use_speaker_embedding and \"speakers_file\" in c and c.speakers_file:\n            # new speaker manager with speaker IDs file.\n            speaker_manager.load_ids_from_file(c.speakers_file)\n\n        if speaker_manager.num_speakers > 0:\n            print(\n                \" > Speaker manager is loaded with {} speakers: {}\".format(\n                    speaker_manager.num_speakers, \", \".join(speaker_manager.name_to_id)\n                )\n            )\n\n        # save file if path is defined\n        if out_path:\n            out_file_path = os.path.join(out_path, \"speakers.json\")\n            print(f\" > Saving `speakers.json` to {out_file_path}.\")\n            if c.use_d_vector_file and c.d_vector_file:\n                speaker_manager.save_embeddings_to_file(out_file_path)\n            else:\n                speaker_manager.save_ids_to_file(out_file_path)\n    return speaker_manager\n\n\ndef get_speaker_balancer_weights(items: list):\n    speaker_names = np.array([item[\"speaker_name\"] for item in items])\n    unique_speaker_names = np.unique(speaker_names).tolist()\n    speaker_ids = [unique_speaker_names.index(l) for l in speaker_names]\n    speaker_count = np.array([len(np.where(speaker_names == l)[0]) for l in unique_speaker_names])\n    weight_speaker = 1.0 / speaker_count\n    dataset_samples_weight = np.array([weight_speaker[l] for l in speaker_ids])\n    # normalize\n    dataset_samples_weight = dataset_samples_weight / np.linalg.norm(dataset_samples_weight)\n    return 
torch.from_numpy(dataset_samples_weight).float()\n"
  },
  {
    "path": "TTS/tts/utils/ssim.py",
    "content": "# Adopted from https://github.com/photosynthesis-team/piq\n\nfrom typing import List, Optional, Tuple, Union\n\nimport torch\nimport torch.nn.functional as F\nfrom torch.nn.modules.loss import _Loss\n\n\ndef _reduce(x: torch.Tensor, reduction: str = \"mean\") -> torch.Tensor:\n    r\"\"\"Reduce input in batch dimension if needed.\n    Args:\n        x: Tensor with shape (N, *).\n        reduction: Specifies the reduction type:\n            ``'none'`` | ``'mean'`` | ``'sum'``. Default: ``'mean'``\n    \"\"\"\n    if reduction == \"none\":\n        return x\n    if reduction == \"mean\":\n        return x.mean(dim=0)\n    if reduction == \"sum\":\n        return x.sum(dim=0)\n    raise ValueError(\"Unknown reduction. Expected one of {'none', 'mean', 'sum'}\")\n\n\ndef _validate_input(\n    tensors: List[torch.Tensor],\n    dim_range: Tuple[int, int] = (0, -1),\n    data_range: Tuple[float, float] = (0.0, -1.0),\n    # size_dim_range: Tuple[float, float] = (0., -1.),\n    size_range: Optional[Tuple[int, int]] = None,\n) -> None:\n    r\"\"\"Check that input(-s)  satisfies the requirements\n    Args:\n        tensors: Tensors to check\n        dim_range: Allowed number of dimensions. (min, max)\n        data_range: Allowed range of values in tensors. (min, max)\n        size_range: Dimensions to include in size comparison. 
(start_dim, end_dim + 1)\n    \"\"\"\n\n    if not __debug__:\n        return\n\n    x = tensors[0]\n\n    for t in tensors:\n        assert torch.is_tensor(t), f\"Expected torch.Tensor, got {type(t)}\"\n        assert t.device == x.device, f\"Expected tensors to be on {x.device}, got {t.device}\"\n\n        if size_range is None:\n            assert t.size() == x.size(), f\"Expected tensors with same size, got {t.size()} and {x.size()}\"\n        else:\n            assert (\n                t.size()[size_range[0] : size_range[1]] == x.size()[size_range[0] : size_range[1]]\n            ), f\"Expected tensors with same size at given dimensions, got {t.size()} and {x.size()}\"\n\n        if dim_range[0] == dim_range[1]:\n            assert t.dim() == dim_range[0], f\"Expected number of dimensions to be {dim_range[0]}, got {t.dim()}\"\n        elif dim_range[0] < dim_range[1]:\n            assert (\n                dim_range[0] <= t.dim() <= dim_range[1]\n            ), f\"Expected number of dimensions to be between {dim_range[0]} and {dim_range[1]}, got {t.dim()}\"\n\n        if data_range[0] < data_range[1]:\n            assert data_range[0] <= t.min(), f\"Expected values to be greater or equal to {data_range[0]}, got {t.min()}\"\n            assert t.max() <= data_range[1], f\"Expected values to be lower or equal to {data_range[1]}, got {t.max()}\"\n\n\ndef gaussian_filter(kernel_size: int, sigma: float) -> torch.Tensor:\n    r\"\"\"Returns 2D Gaussian kernel N(0,`sigma`^2)\n    Args:\n        size: Size of the kernel\n        sigma: Std of the distribution\n    Returns:\n        gaussian_kernel: Tensor with shape (1, kernel_size, kernel_size)\n    \"\"\"\n    coords = torch.arange(kernel_size, dtype=torch.float32)\n    coords -= (kernel_size - 1) / 2.0\n\n    g = coords**2\n    g = (-(g.unsqueeze(0) + g.unsqueeze(1)) / (2 * sigma**2)).exp()\n\n    g /= g.sum()\n    return g.unsqueeze(0)\n\n\ndef ssim(\n    x: torch.Tensor,\n    y: torch.Tensor,\n    kernel_size: 
int = 11,\n    kernel_sigma: float = 1.5,\n    data_range: Union[int, float] = 1.0,\n    reduction: str = \"mean\",\n    full: bool = False,\n    downsample: bool = True,\n    k1: float = 0.01,\n    k2: float = 0.03,\n) -> List[torch.Tensor]:\n    r\"\"\"Interface of Structural Similarity (SSIM) index.\n    Inputs supposed to be in range ``[0, data_range]``.\n    To match performance with skimage and tensorflow set ``'downsample' = True``.\n\n    Args:\n        x: An input tensor. Shape :math:`(N, C, H, W)` or :math:`(N, C, H, W, 2)`.\n        y: A target tensor. Shape :math:`(N, C, H, W)` or :math:`(N, C, H, W, 2)`.\n        kernel_size: The side-length of the sliding window used in comparison. Must be an odd value.\n        kernel_sigma: Sigma of normal distribution.\n        data_range: Maximum value range of images (usually 1.0 or 255).\n        reduction: Specifies the reduction type:\n            ``'none'`` | ``'mean'`` | ``'sum'``. Default:``'mean'``\n        full: Return cs map or not.\n        downsample: Perform average pool before SSIM computation. Default: True\n        k1: Algorithm parameter, K1 (small constant).\n        k2: Algorithm parameter, K2 (small constant).\n            Try a larger K2 constant (e.g. 0.4) if you get a negative or NaN results.\n\n    Returns:\n        Value of Structural Similarity (SSIM) index. In case of 5D input tensors, complex value is returned\n        as a tensor of size 2.\n\n    References:\n        Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. 
(2004).\n        Image quality assessment: From error visibility to structural similarity.\n        IEEE Transactions on Image Processing, 13, 600-612.\n        https://ece.uwaterloo.ca/~z70wang/publications/ssim.pdf,\n        DOI: `10.1109/TIP.2003.819861`\n    \"\"\"\n    assert kernel_size % 2 == 1, f\"Kernel size must be odd, got [{kernel_size}]\"\n    _validate_input([x, y], dim_range=(4, 5), data_range=(0, data_range))\n\n    x = x / float(data_range)\n    y = y / float(data_range)\n\n    # Averagepool image if the size is large enough\n    f = max(1, round(min(x.size()[-2:]) / 256))\n    if (f > 1) and downsample:\n        x = F.avg_pool2d(x, kernel_size=f)\n        y = F.avg_pool2d(y, kernel_size=f)\n\n    kernel = gaussian_filter(kernel_size, kernel_sigma).repeat(x.size(1), 1, 1, 1).to(y)\n    _compute_ssim_per_channel = _ssim_per_channel_complex if x.dim() == 5 else _ssim_per_channel\n    ssim_map, cs_map = _compute_ssim_per_channel(x=x, y=y, kernel=kernel, k1=k1, k2=k2)\n    ssim_val = ssim_map.mean(1)\n    cs = cs_map.mean(1)\n\n    ssim_val = _reduce(ssim_val, reduction)\n    cs = _reduce(cs, reduction)\n\n    if full:\n        return [ssim_val, cs]\n\n    return ssim_val\n\n\nclass SSIMLoss(_Loss):\n    r\"\"\"Creates a criterion that measures the structural similarity index error between\n    each element in the input :math:`x` and target :math:`y`.\n\n    To match performance with skimage and tensorflow set ``'downsample' = True``.\n\n    The unreduced (i.e. with :attr:`reduction` set to ``'none'``) loss can be described as:\n\n    .. math::\n        SSIM = \\{ssim_1,\\dots,ssim_{N \\times C}\\}\\\\\n        ssim_{l}(x, y) = \\frac{(2 \\mu_x \\mu_y + c_1) (2 \\sigma_{xy} + c_2)}\n        {(\\mu_x^2 +\\mu_y^2 + c_1)(\\sigma_x^2 +\\sigma_y^2 + c_2)},\n\n    where :math:`N` is the batch size, `C` is the channel size. If :attr:`reduction` is not ``'none'``\n    (default ``'mean'``), then:\n\n    .. 
math::\n        SSIMLoss(x, y) =\n        \\begin{cases}\n            \\operatorname{mean}(1 - SSIM), &  \\text{if reduction} = \\text{'mean';}\\\\\n            \\operatorname{sum}(1 - SSIM),  &  \\text{if reduction} = \\text{'sum'.}\n        \\end{cases}\n\n    :math:`x` and :math:`y` are tensors of arbitrary shapes with a total\n    of :math:`n` elements each.\n\n    The sum operation still operates over all the elements, and divides by :math:`n`.\n    The division by :math:`n` can be avoided if one sets ``reduction = 'sum'``.\n    In case of 5D input tensors, complex value is returned as a tensor of size 2.\n\n    Args:\n        kernel_size: By default, the mean and covariance of a pixel is obtained\n            by convolution with given filter_size.\n        kernel_sigma: Standard deviation for Gaussian kernel.\n        k1: Coefficient related to c1 in the above equation.\n        k2: Coefficient related to c2 in the above equation.\n        downsample: Perform average pool before SSIM computation. Default: True\n        reduction: Specifies the reduction type:\n            ``'none'`` | ``'mean'`` | ``'sum'``. Default:``'mean'``\n        data_range: Maximum value range of images (usually 1.0 or 255).\n\n    Examples:\n        >>> loss = SSIMLoss()\n        >>> x = torch.rand(3, 3, 256, 256, requires_grad=True)\n        >>> y = torch.rand(3, 3, 256, 256)\n        >>> output = loss(x, y)\n        >>> output.backward()\n\n    References:\n        Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. 
(2004).\n        Image quality assessment: From error visibility to structural similarity.\n        IEEE Transactions on Image Processing, 13, 600-612.\n        https://ece.uwaterloo.ca/~z70wang/publications/ssim.pdf,\n        DOI:`10.1109/TIP.2003.819861`\n    \"\"\"\n    __constants__ = [\"kernel_size\", \"k1\", \"k2\", \"sigma\", \"kernel\", \"reduction\"]\n\n    def __init__(\n        self,\n        kernel_size: int = 11,\n        kernel_sigma: float = 1.5,\n        k1: float = 0.01,\n        k2: float = 0.03,\n        downsample: bool = True,\n        reduction: str = \"mean\",\n        data_range: Union[int, float] = 1.0,\n    ) -> None:\n        super().__init__()\n\n        # Generic loss parameters.\n        self.reduction = reduction\n\n        # Loss-specific parameters.\n        self.kernel_size = kernel_size\n\n        # This check might look redundant because kernel size is checked within the ssim function anyway.\n        # However, this check allows to fail fast when the loss is being initialised and training has not been started.\n        assert kernel_size % 2 == 1, f\"Kernel size must be odd, got [{kernel_size}]\"\n        self.kernel_sigma = kernel_sigma\n        self.k1 = k1\n        self.k2 = k2\n        self.downsample = downsample\n        self.data_range = data_range\n\n    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:\n        r\"\"\"Computation of Structural Similarity (SSIM) index as a loss function.\n\n        Args:\n            x: An input tensor. Shape :math:`(N, C, H, W)` or :math:`(N, C, H, W, 2)`.\n            y: A target tensor. Shape :math:`(N, C, H, W)` or :math:`(N, C, H, W, 2)`.\n\n        Returns:\n            Value of SSIM loss to be minimized, i.e ``1 - ssim`` in [0, 1] range. 
In case of 5D input tensors,\n            complex value is returned as a tensor of size 2.\n        \"\"\"\n\n        score = ssim(\n            x=x,\n            y=y,\n            kernel_size=self.kernel_size,\n            kernel_sigma=self.kernel_sigma,\n            downsample=self.downsample,\n            data_range=self.data_range,\n            reduction=self.reduction,\n            full=False,\n            k1=self.k1,\n            k2=self.k2,\n        )\n        return torch.ones_like(score) - score\n\n\ndef _ssim_per_channel(\n    x: torch.Tensor,\n    y: torch.Tensor,\n    kernel: torch.Tensor,\n    k1: float = 0.01,\n    k2: float = 0.03,\n) -> Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]:\n    r\"\"\"Calculate Structural Similarity (SSIM) index for X and Y per channel.\n\n    Args:\n        x: An input tensor. Shape :math:`(N, C, H, W)`.\n        y: A target tensor. Shape :math:`(N, C, H, W)`.\n        kernel: 2D Gaussian kernel.\n        k1: Algorithm parameter, K1 (small constant, see [1]).\n        k2: Algorithm parameter, K2 (small constant, see [1]).\n            Try a larger K2 constant (e.g. 0.4) if you get a negative or NaN results.\n\n    Returns:\n        Full Value of Structural Similarity (SSIM) index.\n    \"\"\"\n    if x.size(-1) < kernel.size(-1) or x.size(-2) < kernel.size(-2):\n        raise ValueError(\n            f\"Kernel size can't be greater than actual input size. Input size: {x.size()}. 
\"\n            f\"Kernel size: {kernel.size()}\"\n        )\n\n    c1 = k1**2\n    c2 = k2**2\n    n_channels = x.size(1)\n    mu_x = F.conv2d(x, weight=kernel, stride=1, padding=0, groups=n_channels)\n    mu_y = F.conv2d(y, weight=kernel, stride=1, padding=0, groups=n_channels)\n\n    mu_xx = mu_x**2\n    mu_yy = mu_y**2\n    mu_xy = mu_x * mu_y\n\n    sigma_xx = F.conv2d(x**2, weight=kernel, stride=1, padding=0, groups=n_channels) - mu_xx\n    sigma_yy = F.conv2d(y**2, weight=kernel, stride=1, padding=0, groups=n_channels) - mu_yy\n    sigma_xy = F.conv2d(x * y, weight=kernel, stride=1, padding=0, groups=n_channels) - mu_xy\n\n    # Contrast sensitivity (CS) with alpha = beta = gamma = 1.\n    cs = (2.0 * sigma_xy + c2) / (sigma_xx + sigma_yy + c2)\n\n    # Structural similarity (SSIM)\n    ss = (2.0 * mu_xy + c1) / (mu_xx + mu_yy + c1) * cs\n\n    ssim_val = ss.mean(dim=(-1, -2))\n    cs = cs.mean(dim=(-1, -2))\n    return ssim_val, cs\n\n\ndef _ssim_per_channel_complex(\n    x: torch.Tensor,\n    y: torch.Tensor,\n    kernel: torch.Tensor,\n    k1: float = 0.01,\n    k2: float = 0.03,\n) -> Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]:\n    r\"\"\"Calculate Structural Similarity (SSIM) index for Complex X and Y per channel.\n\n    Args:\n        x: An input tensor. Shape :math:`(N, C, H, W, 2)`.\n        y: A target tensor. Shape :math:`(N, C, H, W, 2)`.\n        kernel: 2-D gauss kernel.\n        k1: Algorithm parameter, K1 (small constant, see [1]).\n        k2: Algorithm parameter, K2 (small constant, see [1]).\n            Try a larger K2 constant (e.g. 0.4) if you get a negative or NaN results.\n\n    Returns:\n        Full Value of Complex Structural Similarity (SSIM) index.\n    \"\"\"\n    n_channels = x.size(1)\n    if x.size(-2) < kernel.size(-1) or x.size(-3) < kernel.size(-2):\n        raise ValueError(\n            f\"Kernel size can't be greater than actual input size. Input size: {x.size()}. 
\"\n            f\"Kernel size: {kernel.size()}\"\n        )\n\n    c1 = k1**2\n    c2 = k2**2\n\n    x_real = x[..., 0]\n    x_imag = x[..., 1]\n    y_real = y[..., 0]\n    y_imag = y[..., 1]\n\n    mu1_real = F.conv2d(x_real, weight=kernel, stride=1, padding=0, groups=n_channels)\n    mu1_imag = F.conv2d(x_imag, weight=kernel, stride=1, padding=0, groups=n_channels)\n    mu2_real = F.conv2d(y_real, weight=kernel, stride=1, padding=0, groups=n_channels)\n    mu2_imag = F.conv2d(y_imag, weight=kernel, stride=1, padding=0, groups=n_channels)\n\n    mu1_sq = mu1_real.pow(2) + mu1_imag.pow(2)\n    mu2_sq = mu2_real.pow(2) + mu2_imag.pow(2)\n    mu1_mu2_real = mu1_real * mu2_real - mu1_imag * mu2_imag\n    mu1_mu2_imag = mu1_real * mu2_imag + mu1_imag * mu2_real\n\n    compensation = 1.0\n\n    x_sq = x_real.pow(2) + x_imag.pow(2)\n    y_sq = y_real.pow(2) + y_imag.pow(2)\n    x_y_real = x_real * y_real - x_imag * y_imag\n    x_y_imag = x_real * y_imag + x_imag * y_real\n\n    sigma1_sq = F.conv2d(x_sq, weight=kernel, stride=1, padding=0, groups=n_channels) - mu1_sq\n    sigma2_sq = F.conv2d(y_sq, weight=kernel, stride=1, padding=0, groups=n_channels) - mu2_sq\n    sigma12_real = F.conv2d(x_y_real, weight=kernel, stride=1, padding=0, groups=n_channels) - mu1_mu2_real\n    sigma12_imag = F.conv2d(x_y_imag, weight=kernel, stride=1, padding=0, groups=n_channels) - mu1_mu2_imag\n    sigma12 = torch.stack((sigma12_imag, sigma12_real), dim=-1)\n    mu1_mu2 = torch.stack((mu1_mu2_real, mu1_mu2_imag), dim=-1)\n    # Set alpha = beta = gamma = 1.\n    cs_map = (sigma12 * 2 + c2 * compensation) / (sigma1_sq.unsqueeze(-1) + sigma2_sq.unsqueeze(-1) + c2 * compensation)\n    ssim_map = (mu1_mu2 * 2 + c1 * compensation) / (mu1_sq.unsqueeze(-1) + mu2_sq.unsqueeze(-1) + c1 * compensation)\n    ssim_map = ssim_map * cs_map\n\n    ssim_val = ssim_map.mean(dim=(-2, -3))\n    cs = cs_map.mean(dim=(-2, -3))\n\n    return ssim_val, cs\n"
  },
  {
    "path": "TTS/tts/utils/synthesis.py",
    "content": "from typing import Dict\n\nimport numpy as np\nimport torch\nfrom torch import nn\n\n\ndef numpy_to_torch(np_array, dtype, cuda=False):\n    if np_array is None:\n        return None\n    tensor = torch.as_tensor(np_array, dtype=dtype)\n    if cuda:\n        return tensor.cuda()\n    return tensor\n\n\ndef compute_style_mel(style_wav, ap, cuda=False):\n    style_mel = torch.FloatTensor(ap.melspectrogram(ap.load_wav(style_wav, sr=ap.sample_rate))).unsqueeze(0)\n    if cuda:\n        return style_mel.cuda()\n    return style_mel\n\n\ndef run_model_torch(\n    model: nn.Module,\n    inputs: torch.Tensor,\n    speaker_id: int = None,\n    style_mel: torch.Tensor = None,\n    style_text: str = None,\n    d_vector: torch.Tensor = None,\n    language_id: torch.Tensor = None,\n) -> Dict:\n    \"\"\"Run a torch model for inference. It does not support batch inference.\n\n    Args:\n        model (nn.Module): The model to run inference.\n        inputs (torch.Tensor): Input tensor with character ids.\n        speaker_id (int, optional): Input speaker ids for multi-speaker models. Defaults to None.\n        style_mel (torch.Tensor, optional): Spectrograms used for voice styling . Defaults to None.\n        d_vector (torch.Tensor, optional): d-vector for multi-speaker models    . 
Defaults to None.\n\n    Returns:\n        Dict: model outputs.\n    \"\"\"\n    input_lengths = torch.tensor(inputs.shape[1:2]).to(inputs.device)\n    if hasattr(model, \"module\"):\n        _func = model.module.inference\n    else:\n        _func = model.inference\n    outputs = _func(\n        inputs,\n        aux_input={\n            \"x_lengths\": input_lengths,\n            \"speaker_ids\": speaker_id,\n            \"d_vectors\": d_vector,\n            \"style_mel\": style_mel,\n            \"style_text\": style_text,\n            \"language_ids\": language_id,\n        },\n    )\n    return outputs\n\n\ndef trim_silence(wav, ap):\n    return wav[: ap.find_endpoint(wav)]\n\n\ndef inv_spectrogram(postnet_output, ap, CONFIG):\n    if CONFIG.model.lower() in [\"tacotron\"]:\n        wav = ap.inv_spectrogram(postnet_output.T)\n    else:\n        wav = ap.inv_melspectrogram(postnet_output.T)\n    return wav\n\n\ndef id_to_torch(aux_id, cuda=False):\n    if aux_id is not None:\n        aux_id = np.asarray(aux_id)\n        aux_id = torch.from_numpy(aux_id)\n    # guard against calling .cuda() on None\n    if cuda and aux_id is not None:\n        return aux_id.cuda()\n    return aux_id\n\n\ndef embedding_to_torch(d_vector, cuda=False):\n    if d_vector is not None:\n        d_vector = np.asarray(d_vector)\n        d_vector = torch.from_numpy(d_vector).type(torch.FloatTensor)\n        d_vector = d_vector.squeeze().unsqueeze(0)\n    # guard against calling .cuda() on None\n    if cuda and d_vector is not None:\n        return d_vector.cuda()\n    return d_vector\n\n\n# TODO: perform GL with pytorch for batching\ndef apply_griffin_lim(inputs, input_lens, CONFIG, ap):\n    \"\"\"Apply Griffin-Lim to each sample, iterating through the first dimension.\n    Args:\n        inputs (Tensor or np.Array): Features to be converted by GL. 
First dimension is the batch size.\n        input_lens (Tensor or np.Array): 1D array of sample lengths.\n        CONFIG (Dict): TTS config.\n        ap (AudioProcessor): TTS audio processor.\n    \"\"\"\n    wavs = []\n    for idx, spec in enumerate(inputs):\n        wav_len = (input_lens[idx] * ap.hop_length) - ap.hop_length  # inverse librosa padding\n        wav = inv_spectrogram(spec, ap, CONFIG)\n        # assert len(wav) == wav_len, f\" [!] wav length: {len(wav)} vs expected: {wav_len}\"\n        wavs.append(wav[:wav_len])\n    return wavs\n\n\ndef synthesis(\n    model,\n    text,\n    CONFIG,\n    use_cuda,\n    speaker_id=None,\n    style_wav=None,\n    style_text=None,\n    use_griffin_lim=False,\n    do_trim_silence=False,\n    d_vector=None,\n    language_id=None,\n):\n    \"\"\"Synthesize voice for the given text using Griffin-Lim vocoder or just compute output features to be passed to\n    the vocoder model.\n\n    Args:\n        model (TTS.tts.models):\n            The TTS model to synthesize audio with.\n\n        text (str):\n            The input text to convert to speech.\n\n        CONFIG (Coqpit):\n            Model configuration.\n\n        use_cuda (bool):\n            Enable/disable CUDA.\n\n        speaker_id (int):\n            Speaker ID passed to the speaker embedding layer in multi-speaker model. Defaults to None.\n\n        style_wav (str | Dict[str, float]):\n            Path or tensor to/of a waveform used for computing the style embedding based on GST or Capacitron.\n            Defaults to None, meaning that Capacitron models will sample from the prior distribution to\n            generate random but realistic prosody.\n\n        style_text (str):\n            Transcription of style_wav for Capacitron models. Defaults to None.\n\n        enable_eos_bos_chars (bool):\n            enable special chars for end of sentence and start of sentence. 
Defaults to False.\n\n        do_trim_silence (bool):\n            trim silence after synthesis. Defaults to False.\n\n        d_vector (torch.Tensor):\n            d-vector for multi-speaker models in shape :math:`[1, D]`. Defaults to None.\n\n        language_id (int):\n            Language ID passed to the language embedding layer in multi-lingual model. Defaults to None.\n    \"\"\"\n    # GST or Capacitron processing\n    # TODO: need to handle the case of setting both gst and capacitron to true somewhere\n    style_mel = None\n    if CONFIG.has(\"gst\") and CONFIG.gst and style_wav is not None:\n        if isinstance(style_wav, dict):\n            style_mel = style_wav\n        else:\n            style_mel = compute_style_mel(style_wav, model.ap, cuda=use_cuda)\n\n    if CONFIG.has(\"capacitron_vae\") and CONFIG.use_capacitron_vae and style_wav is not None:\n        style_mel = compute_style_mel(style_wav, model.ap, cuda=use_cuda)\n        style_mel = style_mel.transpose(1, 2)  # [1, time, depth]\n\n    language_name = None\n    if language_id is not None:\n        language = [k for k, v in model.language_manager.name_to_id.items() if v == language_id]\n        assert len(language) == 1, \"language_id must be a valid language\"\n        language_name = language[0]\n\n    # convert text to sequence of token IDs\n    text_inputs = np.asarray(\n        model.tokenizer.text_to_ids(text, language=language_name),\n        dtype=np.int32,\n    )\n    # pass tensors to backend\n    if speaker_id is not None:\n        speaker_id = id_to_torch(speaker_id, cuda=use_cuda)\n\n    if d_vector is not None:\n        d_vector = embedding_to_torch(d_vector, cuda=use_cuda)\n\n    if language_id is not None:\n        language_id = id_to_torch(language_id, cuda=use_cuda)\n\n    if not isinstance(style_mel, dict):\n        # GST or Capacitron style mel\n        style_mel = numpy_to_torch(style_mel, torch.float, cuda=use_cuda)\n        if style_text is not None:\n            
style_text = np.asarray(\n                model.tokenizer.text_to_ids(style_text, language=language_id),\n                dtype=np.int32,\n            )\n            style_text = numpy_to_torch(style_text, torch.long, cuda=use_cuda)\n            style_text = style_text.unsqueeze(0)\n\n    text_inputs = numpy_to_torch(text_inputs, torch.long, cuda=use_cuda)\n    text_inputs = text_inputs.unsqueeze(0)\n    # synthesize voice\n    outputs = run_model_torch(\n        model,\n        text_inputs,\n        speaker_id,\n        style_mel,\n        style_text,\n        d_vector=d_vector,\n        language_id=language_id,\n    )\n    model_outputs = outputs[\"model_outputs\"]\n    model_outputs = model_outputs[0].data.cpu().numpy()\n    alignments = outputs[\"alignments\"]\n\n    # convert outputs to numpy\n    # plot results\n    wav = None\n    model_outputs = model_outputs.squeeze()\n    if model_outputs.ndim == 2:  # [T, C_spec]\n        if use_griffin_lim:\n            wav = inv_spectrogram(model_outputs, model.ap, CONFIG)\n            # trim silence\n            if do_trim_silence:\n                wav = trim_silence(wav, model.ap)\n    else:  # [T,]\n        wav = model_outputs\n    return_dict = {\n        \"wav\": wav,\n        \"alignments\": alignments,\n        \"text_inputs\": text_inputs,\n        \"outputs\": outputs,\n    }\n    return return_dict\n\n\ndef transfer_voice(\n    model,\n    CONFIG,\n    use_cuda,\n    reference_wav,\n    speaker_id=None,\n    d_vector=None,\n    reference_speaker_id=None,\n    reference_d_vector=None,\n    do_trim_silence=False,\n    use_griffin_lim=False,\n):\n    \"\"\"Synthesize voice for the given text using Griffin-Lim vocoder or just compute output features to be passed to\n    the vocoder model.\n\n    Args:\n        model (TTS.tts.models):\n            The TTS model to synthesize audio with.\n\n        CONFIG (Coqpit):\n            Model configuration.\n\n        use_cuda (bool):\n            Enable/disable CUDA.\n\n   
     reference_wav (str):\n            Path of the reference waveform to be used for voice conversion.\n\n        speaker_id (int):\n            Speaker ID passed to the speaker embedding layer in multi-speaker model. Defaults to None.\n\n        d_vector (torch.Tensor):\n            d-vector for multi-speaker models in shape :math:`[1, D]`. Defaults to None.\n\n        reference_speaker_id (int):\n            Reference Speaker ID passed to the speaker embedding layer in multi-speaker model. Defaults to None.\n\n        reference_d_vector (torch.Tensor):\n            Reference d-vector for multi-speaker models in shape :math:`[1, D]`. Defaults to None.\n\n        do_trim_silence (bool):\n            trim silence after synthesis. Defaults to False.\n\n        use_griffin_lim (bool):\n            convert spectrogram outputs to a waveform with Griffin-Lim. Defaults to False.\n    \"\"\"\n    # pass tensors to backend\n    if speaker_id is not None:\n        speaker_id = id_to_torch(speaker_id, cuda=use_cuda)\n\n    if d_vector is not None:\n        d_vector = embedding_to_torch(d_vector, cuda=use_cuda)\n\n    if reference_d_vector is not None:\n        reference_d_vector = embedding_to_torch(reference_d_vector, cuda=use_cuda)\n\n    # load reference_wav audio\n    reference_wav = embedding_to_torch(\n        model.ap.load_wav(\n            reference_wav, sr=model.args.encoder_sample_rate if model.args.encoder_sample_rate else model.ap.sample_rate\n        ),\n        cuda=use_cuda,\n    )\n\n    if hasattr(model, \"module\"):\n        _func = model.module.inference_voice_conversion\n    else:\n        _func = model.inference_voice_conversion\n    model_outputs = _func(reference_wav, speaker_id, d_vector, reference_speaker_id, reference_d_vector)\n\n    # convert outputs to numpy\n    # plot results\n    wav = None\n    model_outputs = model_outputs.squeeze()\n    if model_outputs.ndim == 2:  # [T, C_spec]\n        if use_griffin_lim:\n            wav = 
inv_spectrogram(model_outputs, model.ap, CONFIG)\n            # trim silence\n            if do_trim_silence:\n                wav = trim_silence(wav, model.ap)\n    else:  # [T,]\n        wav = model_outputs\n\n    return wav\n"
  },
  {
    "path": "TTS/tts/utils/text/__init__.py",
    "content": "from TTS.tts.utils.text.tokenizer import TTSTokenizer\n"
  },
  {
    "path": "TTS/tts/utils/text/characters.py",
    "content": "from dataclasses import replace\nfrom typing import Dict\n\nfrom TTS.tts.configs.shared_configs import CharactersConfig\n\n\ndef parse_symbols():\n    return {\n        \"pad\": _pad,\n        \"eos\": _eos,\n        \"bos\": _bos,\n        \"characters\": _characters,\n        \"punctuations\": _punctuations,\n        \"phonemes\": _phonemes,\n    }\n\n\n# DEFAULT SET OF GRAPHEMES\n_pad = \"<PAD>\"\n_eos = \"<EOS>\"\n_bos = \"<BOS>\"\n_blank = \"<BLNK>\"  # TODO: check if we need this alongside with PAD\n_characters = \"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\"\n_punctuations = \"!'(),-.:;? \"\n\n\n# DEFAULT SET OF IPA PHONEMES\n# Phonemes definition (All IPA characters)\n_vowels = \"iyɨʉɯuɪʏʊeøɘəɵɤoɛœɜɞʌɔæɐaɶɑɒᵻ\"\n_non_pulmonic_consonants = \"ʘɓǀɗǃʄǂɠǁʛ\"\n_pulmonic_consonants = \"pbtdʈɖcɟkɡqɢʔɴŋɲɳnɱmʙrʀⱱɾɽɸβfvθðszʃʒʂʐçʝxɣχʁħʕhɦɬɮʋɹɻjɰlɭʎʟ\"\n_suprasegmentals = \"ˈˌːˑ\"\n_other_symbols = \"ʍwɥʜʢʡɕʑɺɧʲ\"\n_diacrilics = \"ɚ˞ɫ\"\n_phonemes = _vowels + _non_pulmonic_consonants + _pulmonic_consonants + _suprasegmentals + _other_symbols + _diacrilics\n\n\nclass BaseVocabulary:\n    \"\"\"Base Vocabulary class.\n\n    This class only needs a vocabulary dictionary without specifying the characters.\n\n    Args:\n        vocab (Dict): A dictionary of characters and their corresponding indices.\n    \"\"\"\n\n    def __init__(self, vocab: Dict, pad: str = None, blank: str = None, bos: str = None, eos: str = None):\n        self.vocab = vocab\n        self.pad = pad\n        self.blank = blank\n        self.bos = bos\n        self.eos = eos\n\n    @property\n    def pad_id(self) -> int:\n        \"\"\"Return the index of the padding character. If the padding character is not specified, return the length\n        of the vocabulary.\"\"\"\n        return self.char_to_id(self.pad) if self.pad else len(self.vocab)\n\n    @property\n    def blank_id(self) -> int:\n        \"\"\"Return the index of the blank character. 
If the blank character is not specified, return the length of\n        the vocabulary.\"\"\"\n        return self.char_to_id(self.blank) if self.blank else len(self.vocab)\n\n    @property\n    def vocab(self):\n        \"\"\"Return the vocabulary dictionary.\"\"\"\n        return self._vocab\n\n    @vocab.setter\n    def vocab(self, vocab):\n        \"\"\"Set the vocabulary dictionary and character mapping dictionaries.\"\"\"\n        self._vocab = vocab\n        self._char_to_id = {char: idx for idx, char in enumerate(self._vocab)}\n        self._id_to_char = {\n            idx: char for idx, char in enumerate(self._vocab)  # pylint: disable=unnecessary-comprehension\n        }\n\n    @staticmethod\n    def init_from_config(config, **kwargs):\n        \"\"\"Initialize from the given config.\"\"\"\n        if config.characters is not None and \"vocab_dict\" in config.characters and config.characters.vocab_dict:\n            return (\n                BaseVocabulary(\n                    config.characters.vocab_dict,\n                    config.characters.pad,\n                    config.characters.blank,\n                    config.characters.bos,\n                    config.characters.eos,\n                ),\n                config,\n            )\n        return BaseVocabulary(**kwargs), config\n\n    @property\n    def num_chars(self):\n        \"\"\"Return the number of tokens in the vocabulary.\"\"\"\n        return len(self._vocab)\n\n    def char_to_id(self, char: str) -> int:\n        \"\"\"Map a character to a token ID.\"\"\"\n        try:\n            return self._char_to_id[char]\n        except KeyError as e:\n            raise KeyError(f\" [!] 
{repr(char)} is not in the vocabulary.\") from e\n\n    def id_to_char(self, idx: int) -> str:\n        \"\"\"Map a token ID to a character.\"\"\"\n        return self._id_to_char[idx]\n\n\nclass BaseCharacters:\n    \"\"\"🐸BaseCharacters class\n\n        Every new character class should inherit from this.\n\n        Characters are ordered as follows ```[PAD, EOS, BOS, BLANK, CHARACTERS, PUNCTUATIONS]```.\n\n        If you need a custom order, inherit from this class and override the ```_create_vocab``` method.\n\n        Args:\n            characters (str):\n                Main set of characters to be used in the vocabulary.\n\n            punctuations (str):\n                Characters to be treated as punctuation.\n\n            pad (str):\n                Special padding character that would be ignored by the model.\n\n            eos (str):\n                End of the sentence character.\n\n            bos (str):\n                Beginning of the sentence character.\n\n            blank (str):\n                Optional character used between characters by some models for better prosody.\n\n            is_unique (bool):\n                Remove duplicates from the provided characters. Defaults to False.\n\n            is_sorted (bool):\n                Sort the characters in alphabetical order. Only applies to `self.characters`. 
Defaults to True.\n    \"\"\"\n\n    def __init__(\n        self,\n        characters: str = None,\n        punctuations: str = None,\n        pad: str = None,\n        eos: str = None,\n        bos: str = None,\n        blank: str = None,\n        is_unique: bool = False,\n        is_sorted: bool = True,\n    ) -> None:\n        self._characters = characters\n        self._punctuations = punctuations\n        self._pad = pad\n        self._eos = eos\n        self._bos = bos\n        self._blank = blank\n        self.is_unique = is_unique\n        self.is_sorted = is_sorted\n        self._create_vocab()\n\n    @property\n    def pad_id(self) -> int:\n        return self.char_to_id(self.pad) if self.pad else len(self.vocab)\n\n    @property\n    def blank_id(self) -> int:\n        return self.char_to_id(self.blank) if self.blank else len(self.vocab)\n\n    @property\n    def characters(self):\n        return self._characters\n\n    @characters.setter\n    def characters(self, characters):\n        self._characters = characters\n        self._create_vocab()\n\n    @property\n    def punctuations(self):\n        return self._punctuations\n\n    @punctuations.setter\n    def punctuations(self, punctuations):\n        self._punctuations = punctuations\n        self._create_vocab()\n\n    @property\n    def pad(self):\n        return self._pad\n\n    @pad.setter\n    def pad(self, pad):\n        self._pad = pad\n        self._create_vocab()\n\n    @property\n    def eos(self):\n        return self._eos\n\n    @eos.setter\n    def eos(self, eos):\n        self._eos = eos\n        self._create_vocab()\n\n    @property\n    def bos(self):\n        return self._bos\n\n    @bos.setter\n    def bos(self, bos):\n        self._bos = bos\n        self._create_vocab()\n\n    @property\n    def blank(self):\n        return self._blank\n\n    @blank.setter\n    def blank(self, blank):\n        self._blank = blank\n        self._create_vocab()\n\n    @property\n    def vocab(self):\n 
       return self._vocab\n\n    @vocab.setter\n    def vocab(self, vocab):\n        self._vocab = vocab\n        self._char_to_id = {char: idx for idx, char in enumerate(self.vocab)}\n        self._id_to_char = {\n            idx: char for idx, char in enumerate(self.vocab)  # pylint: disable=unnecessary-comprehension\n        }\n\n    @property\n    def num_chars(self):\n        return len(self._vocab)\n\n    def _create_vocab(self):\n        _vocab = self._characters\n        if self.is_unique:\n            _vocab = list(set(_vocab))\n        if self.is_sorted:\n            _vocab = sorted(_vocab)\n        _vocab = list(_vocab)\n        _vocab = [self._blank] + _vocab if self._blank is not None and len(self._blank) > 0 else _vocab\n        _vocab = [self._bos] + _vocab if self._bos is not None and len(self._bos) > 0 else _vocab\n        _vocab = [self._eos] + _vocab if self._eos is not None and len(self._eos) > 0 else _vocab\n        _vocab = [self._pad] + _vocab if self._pad is not None and len(self._pad) > 0 else _vocab\n        self.vocab = _vocab + list(self._punctuations)\n        if self.is_unique:\n            duplicates = {x for x in self.vocab if self.vocab.count(x) > 1}\n            assert (\n                len(self.vocab) == len(self._char_to_id) == len(self._id_to_char)\n            ), f\" [!] There are duplicate characters in the character set. {duplicates}\"\n\n    def char_to_id(self, char: str) -> int:\n        try:\n            return self._char_to_id[char]\n        except KeyError as e:\n            raise KeyError(f\" [!] 
{repr(char)} is not in the vocabulary.\") from e\n\n    def id_to_char(self, idx: int) -> str:\n        return self._id_to_char[idx]\n\n    def print_log(self, level: int = 0):\n        \"\"\"\n        Prints the vocabulary in a nice format.\n        \"\"\"\n        indent = \"\\t\" * level\n        print(f\"{indent}| > Characters: {self._characters}\")\n        print(f\"{indent}| > Punctuations: {self._punctuations}\")\n        print(f\"{indent}| > Pad: {self._pad}\")\n        print(f\"{indent}| > EOS: {self._eos}\")\n        print(f\"{indent}| > BOS: {self._bos}\")\n        print(f\"{indent}| > Blank: {self._blank}\")\n        print(f\"{indent}| > Vocab: {self.vocab}\")\n        print(f\"{indent}| > Num chars: {self.num_chars}\")\n\n    @staticmethod\n    def init_from_config(config: \"Coqpit\"):  # pylint: disable=unused-argument\n        \"\"\"Init your character class from a config.\n\n        Implement this method for your subclass.\n        \"\"\"\n        # use character set from config\n        if config.characters is not None:\n            return BaseCharacters(**config.characters), config\n        # return default character set\n        characters = BaseCharacters()\n        new_config = replace(config, characters=characters.to_config())\n        return characters, new_config\n\n    def to_config(self) -> \"CharactersConfig\":\n        return CharactersConfig(\n            characters=self._characters,\n            punctuations=self._punctuations,\n            pad=self._pad,\n            eos=self._eos,\n            bos=self._bos,\n            blank=self._blank,\n            is_unique=self.is_unique,\n            is_sorted=self.is_sorted,\n        )\n\n\nclass IPAPhonemes(BaseCharacters):\n    \"\"\"🐸IPAPhonemes class to manage `TTS.tts` model vocabulary\n\n    Intended to be used with models using IPAPhonemes as input.\n    It uses system defaults for the undefined class arguments.\n\n    Args:\n        characters (str):\n            Main set of 
case-sensitive characters to be used in the vocabulary. Defaults to `_phonemes`.\n\n        punctuations (str):\n            Characters to be treated as punctuation. Defaults to `_punctuations`.\n\n        pad (str):\n            Special padding character that would be ignored by the model. Defaults to `_pad`.\n\n        eos (str):\n            End of the sentence character. Defaults to `_eos`.\n\n        bos (str):\n            Beginning of the sentence character. Defaults to `_bos`.\n\n        blank (str):\n            Optional character used between characters by some models for better prosody. Defaults to `_blank`.\n\n        is_unique (bool):\n            Remove duplicates from the provided characters. Defaults to False.\n\n        is_sorted (bool):\n            Sort the characters in alphabetical order. Defaults to True.\n    \"\"\"\n\n    def __init__(\n        self,\n        characters: str = _phonemes,\n        punctuations: str = _punctuations,\n        pad: str = _pad,\n        eos: str = _eos,\n        bos: str = _bos,\n        blank: str = _blank,\n        is_unique: bool = False,\n        is_sorted: bool = True,\n    ) -> None:\n        super().__init__(characters, punctuations, pad, eos, bos, blank, is_unique, is_sorted)\n\n    @staticmethod\n    def init_from_config(config: \"Coqpit\"):\n        \"\"\"Init an IPAPhonemes object from a model config.\n\n        If characters are not defined in the config, it will be set to the default characters and the config\n        will be updated.\n        \"\"\"\n        # band-aid for compatibility with old models\n        if \"characters\" in config and config.characters is not None:\n            if \"phonemes\" in config.characters and config.characters.phonemes is not None:\n                config.characters[\"characters\"] = config.characters[\"phonemes\"]\n            return (\n                IPAPhonemes(\n                    characters=config.characters[\"characters\"],\n                    
punctuations=config.characters[\"punctuations\"],\n                    pad=config.characters[\"pad\"],\n                    eos=config.characters[\"eos\"],\n                    bos=config.characters[\"bos\"],\n                    blank=config.characters[\"blank\"],\n                    is_unique=config.characters[\"is_unique\"],\n                    is_sorted=config.characters[\"is_sorted\"],\n                ),\n                config,\n            )\n        # use character set from config\n        if config.characters is not None:\n            return IPAPhonemes(**config.characters), config\n        # return default character set\n        characters = IPAPhonemes()\n        new_config = replace(config, characters=characters.to_config())\n        return characters, new_config\n\n\nclass Graphemes(BaseCharacters):\n    \"\"\"🐸Graphemes class to manage `TTS.tts` model vocabulary\n\n    Intended to be used with models using graphemes as input.\n    It uses system defaults for the undefined class arguments.\n\n    Args:\n        characters (str):\n            Main set of case-sensitive characters to be used in the vocabulary. Defaults to `_characters`.\n\n        punctuations (str):\n            Characters to be treated as punctuation. Defaults to `_punctuations`.\n\n        pad (str):\n            Special padding character that would be ignored by the model. Defaults to `_pad`.\n\n        eos (str):\n            End of the sentence character. Defaults to `_eos`.\n\n        bos (str):\n            Beginning of the sentence character. Defaults to `_bos`.\n\n        is_unique (bool):\n            Remove duplicates from the provided characters. Defaults to False.\n\n        is_sorted (bool):\n            Sort the characters in alphabetical order. 
Defaults to True.\n    \"\"\"\n\n    def __init__(\n        self,\n        characters: str = _characters,\n        punctuations: str = _punctuations,\n        pad: str = _pad,\n        eos: str = _eos,\n        bos: str = _bos,\n        blank: str = _blank,\n        is_unique: bool = False,\n        is_sorted: bool = True,\n    ) -> None:\n        super().__init__(characters, punctuations, pad, eos, bos, blank, is_unique, is_sorted)\n\n    @staticmethod\n    def init_from_config(config: \"Coqpit\"):\n        \"\"\"Init a Graphemes object from a model config\n\n        If characters are not defined in the config, it will be set to the default characters and the config\n        will be updated.\n        \"\"\"\n        if config.characters is not None:\n            # band-aid for compatibility with old models\n            if \"phonemes\" in config.characters:\n                return (\n                    Graphemes(\n                        characters=config.characters[\"characters\"],\n                        punctuations=config.characters[\"punctuations\"],\n                        pad=config.characters[\"pad\"],\n                        eos=config.characters[\"eos\"],\n                        bos=config.characters[\"bos\"],\n                        blank=config.characters[\"blank\"],\n                        is_unique=config.characters[\"is_unique\"],\n                        is_sorted=config.characters[\"is_sorted\"],\n                    ),\n                    config,\n                )\n            return Graphemes(**config.characters), config\n        characters = Graphemes()\n        new_config = replace(config, characters=characters.to_config())\n        return characters, new_config\n\n\nif __name__ == \"__main__\":\n    gr = Graphemes()\n    ph = IPAPhonemes()\n    gr.print_log()\n    ph.print_log()\n"
  },
  {
    "path": "TTS/tts/utils/text/chinese_mandarin/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/utils/text/chinese_mandarin/numbers.py",
    "content": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n\n# Licensed under WTFPL or the Unlicense or CC0.\n# This uses Python 3, but it's easy to port to Python 2 by changing\n# strings to u'xx'.\n\nimport itertools\nimport re\n\n\ndef _num2chinese(num: str, big=False, simp=True, o=False, twoalt=False) -> str:\n    \"\"\"Convert numerical arabic numbers (0->9) to chinese hanzi numbers (〇 -> 九)\n\n    Args:\n        num (str): arabic number to convert\n        big (bool, optional): use financial characters. Defaults to False.\n        simp (bool, optional): use simplified characters instead of tradictional characters. Defaults to True.\n        o (bool, optional): use 〇 for 'zero'. Defaults to False.\n        twoalt (bool, optional): use 两/兩 for 'two' when appropriate. Defaults to False.\n\n    Raises:\n        ValueError: if number is more than 1e48\n        ValueError: if 'e' exposent in number\n\n    Returns:\n        str: converted number as hanzi characters\n    \"\"\"\n\n    # check num first\n    nd = str(num)\n    if abs(float(nd)) >= 1e48:\n        raise ValueError(\"number out of range\")\n    if \"e\" in nd:\n        raise ValueError(\"scientific notation is not supported\")\n    c_symbol = \"正负点\" if simp else \"正負點\"\n    if o:  # formal\n        twoalt = False\n    if big:\n        c_basic = \"零壹贰叁肆伍陆柒捌玖\" if simp else \"零壹貳參肆伍陸柒捌玖\"\n        c_unit1 = \"拾佰仟\"\n        c_twoalt = \"贰\" if simp else \"貳\"\n    else:\n        c_basic = \"〇一二三四五六七八九\" if o else \"零一二三四五六七八九\"\n        c_unit1 = \"十百千\"\n        if twoalt:\n            c_twoalt = \"两\" if simp else \"兩\"\n        else:\n            c_twoalt = \"二\"\n    c_unit2 = \"万亿兆京垓秭穰沟涧正载\" if simp else \"萬億兆京垓秭穰溝澗正載\"\n    revuniq = lambda l: \"\".join(k for k, g in itertools.groupby(reversed(l)))\n    nd = str(num)\n    result = []\n    if nd[0] == \"+\":\n        result.append(c_symbol[0])\n    elif nd[0] == \"-\":\n        result.append(c_symbol[1])\n    if \".\" in nd:\n        integer, 
remainder = nd.lstrip(\"+-\").split(\".\")\n    else:\n        integer, remainder = nd.lstrip(\"+-\"), None\n    if int(integer):\n        splitted = [integer[max(i - 4, 0) : i] for i in range(len(integer), 0, -4)]\n        intresult = []\n        for nu, unit in enumerate(splitted):\n            # special cases\n            if int(unit) == 0:  # 0000\n                intresult.append(c_basic[0])\n                continue\n            if nu > 0 and int(unit) == 2:  # 0002\n                intresult.append(c_twoalt + c_unit2[nu - 1])\n                continue\n            ulist = []\n            unit = unit.zfill(4)\n            for nc, ch in enumerate(reversed(unit)):\n                if ch == \"0\":\n                    if ulist:  # ???0\n                        ulist.append(c_basic[0])\n                elif nc == 0:\n                    ulist.append(c_basic[int(ch)])\n                elif nc == 1 and ch == \"1\" and unit[1] == \"0\":\n                    # special case for tens\n                    # edit the 'elif' if you don't like\n                    # 十四, 三千零十四, 三千三百一十四\n                    ulist.append(c_unit1[0])\n                elif nc > 1 and ch == \"2\":\n                    ulist.append(c_twoalt + c_unit1[nc - 1])\n                else:\n                    ulist.append(c_basic[int(ch)] + c_unit1[nc - 1])\n            ustr = revuniq(ulist)\n            if nu == 0:\n                intresult.append(ustr)\n            else:\n                intresult.append(ustr + c_unit2[nu - 1])\n        result.append(revuniq(intresult).strip(c_basic[0]))\n    else:\n        result.append(c_basic[0])\n    if remainder:\n        result.append(c_symbol[2])\n        result.append(\"\".join(c_basic[int(ch)] for ch in remainder))\n    return \"\".join(result)\n\n\ndef _number_replace(match) -> str:\n    \"\"\"function to apply in a match, transform all numbers in a match by chinese characters\n\n    Args:\n        match (re.Match): numbers regex matches\n\n    Returns:\n   
     str: replaced characters for the numbers\n    \"\"\"\n    match_str: str = match.group()\n    return _num2chinese(match_str)\n\n\ndef replace_numbers_to_characters_in_text(text: str) -> str:\n    \"\"\"Replace all Arabic numerals in a text with their equivalent Chinese characters (simplified)\n\n    Args:\n        text (str): input text to transform\n\n    Returns:\n        str: output text\n    \"\"\"\n    text = re.sub(r\"[0-9]+\", _number_replace, text)\n    return text\n"
  },
  {
    "path": "TTS/tts/utils/text/chinese_mandarin/phonemizer.py",
    "content": "from typing import List\n\nimport jieba\nimport pypinyin\n\nfrom .pinyinToPhonemes import PINYIN_DICT\n\n\ndef _chinese_character_to_pinyin(text: str) -> List[str]:\n    pinyins = pypinyin.pinyin(text, style=pypinyin.Style.TONE3, heteronym=False, neutral_tone_with_five=True)\n    pinyins_flat_list = [item for sublist in pinyins for item in sublist]\n    return pinyins_flat_list\n\n\ndef _chinese_pinyin_to_phoneme(pinyin: str) -> str:\n    segment = pinyin[:-1]\n    tone = pinyin[-1]\n    phoneme = PINYIN_DICT.get(segment, [\"\"])[0]\n    return phoneme + tone\n\n\ndef chinese_text_to_phonemes(text: str, seperator: str = \"|\") -> str:\n    tokenized_text = jieba.cut(text, HMM=False)\n    tokenized_text = \" \".join(tokenized_text)\n    pinyined_text: List[str] = _chinese_character_to_pinyin(tokenized_text)\n\n    results: List[str] = []\n\n    for token in pinyined_text:\n        if token[-1] in \"12345\":  # TODO transform to is_pinyin()\n            pinyin_phonemes = _chinese_pinyin_to_phoneme(token)\n\n            results += list(pinyin_phonemes)\n        else:  # is ponctuation or other\n            results += list(token)\n\n    return seperator.join(results)\n"
  },
  {
    "path": "TTS/tts/utils/text/chinese_mandarin/pinyinToPhonemes.py",
    "content": "PINYIN_DICT = {\n    \"a\": [\"a\"],\n    \"ai\": [\"ai\"],\n    \"an\": [\"an\"],\n    \"ang\": [\"ɑŋ\"],\n    \"ao\": [\"aʌ\"],\n    \"ba\": [\"ba\"],\n    \"bai\": [\"bai\"],\n    \"ban\": [\"ban\"],\n    \"bang\": [\"bɑŋ\"],\n    \"bao\": [\"baʌ\"],\n    # \"be\": [\"be\"], doesnt exist\n    \"bei\": [\"bɛi\"],\n    \"ben\": [\"bœn\"],\n    \"beng\": [\"bɵŋ\"],\n    \"bi\": [\"bi\"],\n    \"bian\": [\"biɛn\"],\n    \"biao\": [\"biaʌ\"],\n    \"bie\": [\"bie\"],\n    \"bin\": [\"bin\"],\n    \"bing\": [\"bɨŋ\"],\n    \"bo\": [\"bo\"],\n    \"bu\": [\"bu\"],\n    \"ca\": [\"tsa\"],\n    \"cai\": [\"tsai\"],\n    \"can\": [\"tsan\"],\n    \"cang\": [\"tsɑŋ\"],\n    \"cao\": [\"tsaʌ\"],\n    \"ce\": [\"tsø\"],\n    \"cen\": [\"tsœn\"],\n    \"ceng\": [\"tsɵŋ\"],\n    \"cha\": [\"ʈʂa\"],\n    \"chai\": [\"ʈʂai\"],\n    \"chan\": [\"ʈʂan\"],\n    \"chang\": [\"ʈʂɑŋ\"],\n    \"chao\": [\"ʈʂaʌ\"],\n    \"che\": [\"ʈʂø\"],\n    \"chen\": [\"ʈʂœn\"],\n    \"cheng\": [\"ʈʂɵŋ\"],\n    \"chi\": [\"ʈʂʏ\"],\n    \"chong\": [\"ʈʂoŋ\"],\n    \"chou\": [\"ʈʂou\"],\n    \"chu\": [\"ʈʂu\"],\n    \"chua\": [\"ʈʂua\"],\n    \"chuai\": [\"ʈʂuai\"],\n    \"chuan\": [\"ʈʂuan\"],\n    \"chuang\": [\"ʈʂuɑŋ\"],\n    \"chui\": [\"ʈʂuei\"],\n    \"chun\": [\"ʈʂun\"],\n    \"chuo\": [\"ʈʂuo\"],\n    \"ci\": [\"tsɪ\"],\n    \"cong\": [\"tsoŋ\"],\n    \"cou\": [\"tsou\"],\n    \"cu\": [\"tsu\"],\n    \"cuan\": [\"tsuan\"],\n    \"cui\": [\"tsuei\"],\n    \"cun\": [\"tsun\"],\n    \"cuo\": [\"tsuo\"],\n    \"da\": [\"da\"],\n    \"dai\": [\"dai\"],\n    \"dan\": [\"dan\"],\n    \"dang\": [\"dɑŋ\"],\n    \"dao\": [\"daʌ\"],\n    \"de\": [\"dø\"],\n    \"dei\": [\"dei\"],\n    # \"den\": [\"dœn\"],\n    \"deng\": [\"dɵŋ\"],\n    \"di\": [\"di\"],\n    \"dia\": [\"dia\"],\n    \"dian\": [\"diɛn\"],\n    \"diao\": [\"diaʌ\"],\n    \"die\": [\"die\"],\n    \"ding\": [\"dɨŋ\"],\n    \"diu\": [\"dio\"],\n    \"dong\": [\"doŋ\"],\n    \"dou\": [\"dou\"],\n    \"du\": [\"du\"],\n    
\"duan\": [\"duan\"],\n    \"dui\": [\"duei\"],\n    \"dun\": [\"dun\"],\n    \"duo\": [\"duo\"],\n    \"e\": [\"ø\"],\n    \"ei\": [\"ei\"],\n    \"en\": [\"œn\"],\n    # \"ng\": [\"œn\"],\n    # \"eng\": [\"ɵŋ\"],\n    \"er\": [\"er\"],\n    \"fa\": [\"fa\"],\n    \"fan\": [\"fan\"],\n    \"fang\": [\"fɑŋ\"],\n    \"fei\": [\"fei\"],\n    \"fen\": [\"fœn\"],\n    \"feng\": [\"fɵŋ\"],\n    \"fo\": [\"fo\"],\n    \"fou\": [\"fou\"],\n    \"fu\": [\"fu\"],\n    \"ga\": [\"ga\"],\n    \"gai\": [\"gai\"],\n    \"gan\": [\"gan\"],\n    \"gang\": [\"gɑŋ\"],\n    \"gao\": [\"gaʌ\"],\n    \"ge\": [\"gø\"],\n    \"gei\": [\"gei\"],\n    \"gen\": [\"gœn\"],\n    \"geng\": [\"gɵŋ\"],\n    \"gong\": [\"goŋ\"],\n    \"gou\": [\"gou\"],\n    \"gu\": [\"gu\"],\n    \"gua\": [\"gua\"],\n    \"guai\": [\"guai\"],\n    \"guan\": [\"guan\"],\n    \"guang\": [\"guɑŋ\"],\n    \"gui\": [\"guei\"],\n    \"gun\": [\"gun\"],\n    \"guo\": [\"guo\"],\n    \"ha\": [\"xa\"],\n    \"hai\": [\"xai\"],\n    \"han\": [\"xan\"],\n    \"hang\": [\"xɑŋ\"],\n    \"hao\": [\"xaʌ\"],\n    \"he\": [\"xø\"],\n    \"hei\": [\"xei\"],\n    \"hen\": [\"xœn\"],\n    \"heng\": [\"xɵŋ\"],\n    \"hong\": [\"xoŋ\"],\n    \"hou\": [\"xou\"],\n    \"hu\": [\"xu\"],\n    \"hua\": [\"xua\"],\n    \"huai\": [\"xuai\"],\n    \"huan\": [\"xuan\"],\n    \"huang\": [\"xuɑŋ\"],\n    \"hui\": [\"xuei\"],\n    \"hun\": [\"xun\"],\n    \"huo\": [\"xuo\"],\n    \"ji\": [\"dʑi\"],\n    \"jia\": [\"dʑia\"],\n    \"jian\": [\"dʑiɛn\"],\n    \"jiang\": [\"dʑiɑŋ\"],\n    \"jiao\": [\"dʑiaʌ\"],\n    \"jie\": [\"dʑie\"],\n    \"jin\": [\"dʑin\"],\n    \"jing\": [\"dʑɨŋ\"],\n    \"jiong\": [\"dʑioŋ\"],\n    \"jiu\": [\"dʑio\"],\n    \"ju\": [\"dʑy\"],\n    \"juan\": [\"dʑyɛn\"],\n    \"jue\": [\"dʑye\"],\n    \"jun\": [\"dʑyn\"],\n    \"ka\": [\"ka\"],\n    \"kai\": [\"kai\"],\n    \"kan\": [\"kan\"],\n    \"kang\": [\"kɑŋ\"],\n    \"kao\": [\"kaʌ\"],\n    \"ke\": [\"kø\"],\n    \"kei\": [\"kei\"],\n    \"ken\": [\"kœn\"],\n    
\"keng\": [\"kɵŋ\"],\n    \"kong\": [\"koŋ\"],\n    \"kou\": [\"kou\"],\n    \"ku\": [\"ku\"],\n    \"kua\": [\"kua\"],\n    \"kuai\": [\"kuai\"],\n    \"kuan\": [\"kuan\"],\n    \"kuang\": [\"kuɑŋ\"],\n    \"kui\": [\"kuei\"],\n    \"kun\": [\"kun\"],\n    \"kuo\": [\"kuo\"],\n    \"la\": [\"la\"],\n    \"lai\": [\"lai\"],\n    \"lan\": [\"lan\"],\n    \"lang\": [\"lɑŋ\"],\n    \"lao\": [\"laʌ\"],\n    \"le\": [\"lø\"],\n    \"lei\": [\"lei\"],\n    \"leng\": [\"lɵŋ\"],\n    \"li\": [\"li\"],\n    \"lia\": [\"lia\"],\n    \"lian\": [\"liɛn\"],\n    \"liang\": [\"liɑŋ\"],\n    \"liao\": [\"liaʌ\"],\n    \"lie\": [\"lie\"],\n    \"lin\": [\"lin\"],\n    \"ling\": [\"lɨŋ\"],\n    \"liu\": [\"lio\"],\n    \"lo\": [\"lo\"],\n    \"long\": [\"loŋ\"],\n    \"lou\": [\"lou\"],\n    \"lu\": [\"lu\"],\n    \"lv\": [\"ly\"],\n    \"luan\": [\"luan\"],\n    \"lve\": [\"lye\"],\n    \"lue\": [\"lue\"],\n    \"lun\": [\"lun\"],\n    \"luo\": [\"luo\"],\n    \"ma\": [\"ma\"],\n    \"mai\": [\"mai\"],\n    \"man\": [\"man\"],\n    \"mang\": [\"mɑŋ\"],\n    \"mao\": [\"maʌ\"],\n    \"me\": [\"mø\"],\n    \"mei\": [\"mei\"],\n    \"men\": [\"mœn\"],\n    \"meng\": [\"mɵŋ\"],\n    \"mi\": [\"mi\"],\n    \"mian\": [\"miɛn\"],\n    \"miao\": [\"miaʌ\"],\n    \"mie\": [\"mie\"],\n    \"min\": [\"min\"],\n    \"ming\": [\"mɨŋ\"],\n    \"miu\": [\"mio\"],\n    \"mo\": [\"mo\"],\n    \"mou\": [\"mou\"],\n    \"mu\": [\"mu\"],\n    \"na\": [\"na\"],\n    \"nai\": [\"nai\"],\n    \"nan\": [\"nan\"],\n    \"nang\": [\"nɑŋ\"],\n    \"nao\": [\"naʌ\"],\n    \"ne\": [\"nø\"],\n    \"nei\": [\"nei\"],\n    \"nen\": [\"nœn\"],\n    \"neng\": [\"nɵŋ\"],\n    \"ni\": [\"ni\"],\n    \"nia\": [\"nia\"],\n    \"nian\": [\"niɛn\"],\n    \"niang\": [\"niɑŋ\"],\n    \"niao\": [\"niaʌ\"],\n    \"nie\": [\"nie\"],\n    \"nin\": [\"nin\"],\n    \"ning\": [\"nɨŋ\"],\n    \"niu\": [\"nio\"],\n    \"nong\": [\"noŋ\"],\n    \"nou\": [\"nou\"],\n    \"nu\": [\"nu\"],\n    \"nv\": [\"ny\"],\n    \"nuan\": 
[\"nuan\"],\n    \"nve\": [\"nye\"],\n    \"nue\": [\"nye\"],\n    \"nuo\": [\"nuo\"],\n    \"o\": [\"o\"],\n    \"ou\": [\"ou\"],\n    \"pa\": [\"pa\"],\n    \"pai\": [\"pai\"],\n    \"pan\": [\"pan\"],\n    \"pang\": [\"pɑŋ\"],\n    \"pao\": [\"paʌ\"],\n    \"pe\": [\"pø\"],\n    \"pei\": [\"pei\"],\n    \"pen\": [\"pœn\"],\n    \"peng\": [\"pɵŋ\"],\n    \"pi\": [\"pi\"],\n    \"pian\": [\"piɛn\"],\n    \"piao\": [\"piaʌ\"],\n    \"pie\": [\"pie\"],\n    \"pin\": [\"pin\"],\n    \"ping\": [\"pɨŋ\"],\n    \"po\": [\"po\"],\n    \"pou\": [\"pou\"],\n    \"pu\": [\"pu\"],\n    \"qi\": [\"tɕi\"],\n    \"qia\": [\"tɕia\"],\n    \"qian\": [\"tɕiɛn\"],\n    \"qiang\": [\"tɕiɑŋ\"],\n    \"qiao\": [\"tɕiaʌ\"],\n    \"qie\": [\"tɕie\"],\n    \"qin\": [\"tɕin\"],\n    \"qing\": [\"tɕɨŋ\"],\n    \"qiong\": [\"tɕioŋ\"],\n    \"qiu\": [\"tɕio\"],\n    \"qu\": [\"tɕy\"],\n    \"quan\": [\"tɕyɛn\"],\n    \"que\": [\"tɕye\"],\n    \"qun\": [\"tɕyn\"],\n    \"ran\": [\"ʐan\"],\n    \"rang\": [\"ʐɑŋ\"],\n    \"rao\": [\"ʐaʌ\"],\n    \"re\": [\"ʐø\"],\n    \"ren\": [\"ʐœn\"],\n    \"reng\": [\"ʐɵŋ\"],\n    \"ri\": [\"ʐʏ\"],\n    \"rong\": [\"ʐoŋ\"],\n    \"rou\": [\"ʐou\"],\n    \"ru\": [\"ʐu\"],\n    \"rua\": [\"ʐua\"],\n    \"ruan\": [\"ʐuan\"],\n    \"rui\": [\"ʐuei\"],\n    \"run\": [\"ʐun\"],\n    \"ruo\": [\"ʐuo\"],\n    \"sa\": [\"sa\"],\n    \"sai\": [\"sai\"],\n    \"san\": [\"san\"],\n    \"sang\": [\"sɑŋ\"],\n    \"sao\": [\"saʌ\"],\n    \"se\": [\"sø\"],\n    \"sen\": [\"sœn\"],\n    \"seng\": [\"sɵŋ\"],\n    \"sha\": [\"ʂa\"],\n    \"shai\": [\"ʂai\"],\n    \"shan\": [\"ʂan\"],\n    \"shang\": [\"ʂɑŋ\"],\n    \"shao\": [\"ʂaʌ\"],\n    \"she\": [\"ʂø\"],\n    \"shei\": [\"ʂei\"],\n    \"shen\": [\"ʂœn\"],\n    \"sheng\": [\"ʂɵŋ\"],\n    \"shi\": [\"ʂʏ\"],\n    \"shou\": [\"ʂou\"],\n    \"shu\": [\"ʂu\"],\n    \"shua\": [\"ʂua\"],\n    \"shuai\": [\"ʂuai\"],\n    \"shuan\": [\"ʂuan\"],\n    \"shuang\": [\"ʂuɑŋ\"],\n    \"shui\": [\"ʂuei\"],\n    \"shun\": [\"ʂun\"],\n    
\"shuo\": [\"ʂuo\"],\n    \"si\": [\"sɪ\"],\n    \"song\": [\"soŋ\"],\n    \"sou\": [\"sou\"],\n    \"su\": [\"su\"],\n    \"suan\": [\"suan\"],\n    \"sui\": [\"suei\"],\n    \"sun\": [\"sun\"],\n    \"suo\": [\"suo\"],\n    \"ta\": [\"ta\"],\n    \"tai\": [\"tai\"],\n    \"tan\": [\"tan\"],\n    \"tang\": [\"tɑŋ\"],\n    \"tao\": [\"taʌ\"],\n    \"te\": [\"tø\"],\n    \"tei\": [\"tei\"],\n    \"teng\": [\"tɵŋ\"],\n    \"ti\": [\"ti\"],\n    \"tian\": [\"tiɛn\"],\n    \"tiao\": [\"tiaʌ\"],\n    \"tie\": [\"tie\"],\n    \"ting\": [\"tɨŋ\"],\n    \"tong\": [\"toŋ\"],\n    \"tou\": [\"tou\"],\n    \"tu\": [\"tu\"],\n    \"tuan\": [\"tuan\"],\n    \"tui\": [\"tuei\"],\n    \"tun\": [\"tun\"],\n    \"tuo\": [\"tuo\"],\n    \"wa\": [\"wa\"],\n    \"wai\": [\"wai\"],\n    \"wan\": [\"wan\"],\n    \"wang\": [\"wɑŋ\"],\n    \"wei\": [\"wei\"],\n    \"wen\": [\"wœn\"],\n    \"weng\": [\"wɵŋ\"],\n    \"wo\": [\"wo\"],\n    \"wu\": [\"wu\"],\n    \"xi\": [\"ɕi\"],\n    \"xia\": [\"ɕia\"],\n    \"xian\": [\"ɕiɛn\"],\n    \"xiang\": [\"ɕiɑŋ\"],\n    \"xiao\": [\"ɕiaʌ\"],\n    \"xie\": [\"ɕie\"],\n    \"xin\": [\"ɕin\"],\n    \"xing\": [\"ɕɨŋ\"],\n    \"xiong\": [\"ɕioŋ\"],\n    \"xiu\": [\"ɕio\"],\n    \"xu\": [\"ɕy\"],\n    \"xuan\": [\"ɕyɛn\"],\n    \"xue\": [\"ɕye\"],\n    \"xun\": [\"ɕyn\"],\n    \"ya\": [\"ia\"],\n    \"yan\": [\"iɛn\"],\n    \"yang\": [\"iɑŋ\"],\n    \"yao\": [\"iaʌ\"],\n    \"ye\": [\"ie\"],\n    \"yi\": [\"i\"],\n    \"yin\": [\"in\"],\n    \"ying\": [\"ɨŋ\"],\n    \"yo\": [\"io\"],\n    \"yong\": [\"ioŋ\"],\n    \"you\": [\"io\"],\n    \"yu\": [\"y\"],\n    \"yuan\": [\"yɛn\"],\n    \"yue\": [\"ye\"],\n    \"yun\": [\"yn\"],\n    \"za\": [\"dza\"],\n    \"zai\": [\"dzai\"],\n    \"zan\": [\"dzan\"],\n    \"zang\": [\"dzɑŋ\"],\n    \"zao\": [\"dzaʌ\"],\n    \"ze\": [\"dzø\"],\n    \"zei\": [\"dzei\"],\n    \"zen\": [\"dzœn\"],\n    \"zeng\": [\"dzɵŋ\"],\n    \"zha\": [\"dʒa\"],\n    \"zhai\": [\"dʒai\"],\n    \"zhan\": [\"dʒan\"],\n    \"zhang\": 
[\"dʒɑŋ\"],\n    \"zhao\": [\"dʒaʌ\"],\n    \"zhe\": [\"dʒø\"],\n    # \"zhei\": [\"dʒei\"], it doesn't exist\n    \"zhen\": [\"dʒœn\"],\n    \"zheng\": [\"dʒɵŋ\"],\n    \"zhi\": [\"dʒʏ\"],\n    \"zhong\": [\"dʒoŋ\"],\n    \"zhou\": [\"dʒou\"],\n    \"zhu\": [\"dʒu\"],\n    \"zhua\": [\"dʒua\"],\n    \"zhuai\": [\"dʒuai\"],\n    \"zhuan\": [\"dʒuan\"],\n    \"zhuang\": [\"dʒuɑŋ\"],\n    \"zhui\": [\"dʒuei\"],\n    \"zhun\": [\"dʒun\"],\n    \"zhuo\": [\"dʒuo\"],\n    \"zi\": [\"dzɪ\"],\n    \"zong\": [\"dzoŋ\"],\n    \"zou\": [\"dzou\"],\n    \"zu\": [\"dzu\"],\n    \"zuan\": [\"dzuan\"],\n    \"zui\": [\"dzuei\"],\n    \"zun\": [\"dzun\"],\n    \"zuo\": [\"dzuo\"],\n}\n"
  },
  {
    "path": "TTS/tts/utils/text/cleaners.py",
    "content": "\"\"\"Set of default text cleaners\"\"\"\n# TODO: pick the cleaner for languages dynamically\n\nimport re\n\nfrom anyascii import anyascii\n\nfrom TTS.tts.utils.text.chinese_mandarin.numbers import replace_numbers_to_characters_in_text\n\nfrom .english.abbreviations import abbreviations_en\nfrom .english.number_norm import normalize_numbers as en_normalize_numbers\nfrom .english.time_norm import expand_time_english\nfrom .french.abbreviations import abbreviations_fr\n\n# Regular expression matching whitespace:\n_whitespace_re = re.compile(r\"\\s+\")\n\n\ndef expand_abbreviations(text, lang=\"en\"):\n    if lang == \"en\":\n        _abbreviations = abbreviations_en\n    elif lang == \"fr\":\n        _abbreviations = abbreviations_fr\n    for regex, replacement in _abbreviations:\n        text = re.sub(regex, replacement, text)\n    return text\n\n\ndef lowercase(text):\n    return text.lower()\n\n\ndef collapse_whitespace(text):\n    return re.sub(_whitespace_re, \" \", text).strip()\n\n\ndef convert_to_ascii(text):\n    return anyascii(text)\n\n\ndef remove_aux_symbols(text):\n    text = re.sub(r\"[\\<\\>\\(\\)\\[\\]\\\"]+\", \"\", text)\n    return text\n\n\ndef replace_symbols(text, lang=\"en\"):\n    \"\"\"Replace symbols based on the lenguage tag.\n\n    Args:\n      text:\n       Input text.\n      lang:\n        Lenguage identifier. 
ex: \"en\", \"fr\", \"pt\", \"ca\".\n\n    Returns:\n      The modified text\n      example:\n        input args:\n            text: \"si l'avi cau, diguem-ho\"\n            lang: \"ca\"\n        Output:\n            text: \"si lavi cau, diguemho\"\n    \"\"\"\n    text = text.replace(\";\", \",\")\n    text = text.replace(\"-\", \" \") if lang != \"ca\" else text.replace(\"-\", \"\")\n    text = text.replace(\":\", \",\")\n    if lang == \"en\":\n        text = text.replace(\"&\", \" and \")\n    elif lang == \"fr\":\n        text = text.replace(\"&\", \" et \")\n    elif lang == \"pt\":\n        text = text.replace(\"&\", \" e \")\n    elif lang == \"ca\":\n        text = text.replace(\"&\", \" i \")\n        text = text.replace(\"'\", \"\")\n    return text\n\n\ndef basic_cleaners(text):\n    \"\"\"Basic pipeline that lowercases and collapses whitespace without transliteration.\"\"\"\n    text = lowercase(text)\n    text = collapse_whitespace(text)\n    return text\n\n\ndef transliteration_cleaners(text):\n    \"\"\"Pipeline for non-English text that transliterates to ASCII.\"\"\"\n    # text = convert_to_ascii(text)\n    text = lowercase(text)\n    text = collapse_whitespace(text)\n    return text\n\n\ndef basic_german_cleaners(text):\n    \"\"\"Pipeline for German text\"\"\"\n    text = lowercase(text)\n    text = collapse_whitespace(text)\n    return text\n\n\n# TODO: elaborate it\ndef basic_turkish_cleaners(text):\n    \"\"\"Pipeline for Turkish text\"\"\"\n    text = text.replace(\"I\", \"ı\")\n    text = lowercase(text)\n    text = collapse_whitespace(text)\n    return text\n\n\ndef english_cleaners(text):\n    \"\"\"Pipeline for English text, including number and abbreviation expansion.\"\"\"\n    # text = convert_to_ascii(text)\n    text = lowercase(text)\n    text = expand_time_english(text)\n    text = en_normalize_numbers(text)\n    text = expand_abbreviations(text)\n    text = replace_symbols(text)\n    text = remove_aux_symbols(text)\n    text = 
collapse_whitespace(text)\n    return text\n\n\ndef phoneme_cleaners(text):\n    \"\"\"Pipeline for phonemes mode, including number and abbreviation expansion.\"\"\"\n    text = en_normalize_numbers(text)\n    text = expand_abbreviations(text)\n    text = replace_symbols(text)\n    text = remove_aux_symbols(text)\n    text = collapse_whitespace(text)\n    return text\n\n\ndef french_cleaners(text):\n    \"\"\"Pipeline for French text. There is no need to expand numbers, phonemizer already does that\"\"\"\n    text = expand_abbreviations(text, lang=\"fr\")\n    text = lowercase(text)\n    text = replace_symbols(text, lang=\"fr\")\n    text = remove_aux_symbols(text)\n    text = collapse_whitespace(text)\n    return text\n\n\ndef portuguese_cleaners(text):\n    \"\"\"Basic pipeline for Portuguese text. There is no need to expand abbreviation and\n    numbers, phonemizer already does that\"\"\"\n    text = lowercase(text)\n    text = replace_symbols(text, lang=\"pt\")\n    text = remove_aux_symbols(text)\n    text = collapse_whitespace(text)\n    return text\n\n\ndef chinese_mandarin_cleaners(text: str) -> str:\n    \"\"\"Basic pipeline for chinese\"\"\"\n    text = replace_numbers_to_characters_in_text(text)\n    return text\n\n\ndef multilingual_cleaners(text):\n    \"\"\"Pipeline for multilingual text\"\"\"\n    text = lowercase(text)\n    text = replace_symbols(text, lang=None)\n    text = remove_aux_symbols(text)\n    text = collapse_whitespace(text)\n    return text\n"
  },
  {
    "path": "TTS/tts/utils/text/cmudict.py",
    "content": "# -*- coding: utf-8 -*-\n\nimport re\n\nVALID_SYMBOLS = [\n    \"AA\",\n    \"AA0\",\n    \"AA1\",\n    \"AA2\",\n    \"AE\",\n    \"AE0\",\n    \"AE1\",\n    \"AE2\",\n    \"AH\",\n    \"AH0\",\n    \"AH1\",\n    \"AH2\",\n    \"AO\",\n    \"AO0\",\n    \"AO1\",\n    \"AO2\",\n    \"AW\",\n    \"AW0\",\n    \"AW1\",\n    \"AW2\",\n    \"AY\",\n    \"AY0\",\n    \"AY1\",\n    \"AY2\",\n    \"B\",\n    \"CH\",\n    \"D\",\n    \"DH\",\n    \"EH\",\n    \"EH0\",\n    \"EH1\",\n    \"EH2\",\n    \"ER\",\n    \"ER0\",\n    \"ER1\",\n    \"ER2\",\n    \"EY\",\n    \"EY0\",\n    \"EY1\",\n    \"EY2\",\n    \"F\",\n    \"G\",\n    \"HH\",\n    \"IH\",\n    \"IH0\",\n    \"IH1\",\n    \"IH2\",\n    \"IY\",\n    \"IY0\",\n    \"IY1\",\n    \"IY2\",\n    \"JH\",\n    \"K\",\n    \"L\",\n    \"M\",\n    \"N\",\n    \"NG\",\n    \"OW\",\n    \"OW0\",\n    \"OW1\",\n    \"OW2\",\n    \"OY\",\n    \"OY0\",\n    \"OY1\",\n    \"OY2\",\n    \"P\",\n    \"R\",\n    \"S\",\n    \"SH\",\n    \"T\",\n    \"TH\",\n    \"UH\",\n    \"UH0\",\n    \"UH1\",\n    \"UH2\",\n    \"UW\",\n    \"UW0\",\n    \"UW1\",\n    \"UW2\",\n    \"V\",\n    \"W\",\n    \"Y\",\n    \"Z\",\n    \"ZH\",\n]\n\n\nclass CMUDict:\n    \"\"\"Thin wrapper around CMUDict data. 
http://www.speech.cs.cmu.edu/cgi-bin/cmudict\"\"\"\n\n    def __init__(self, file_or_path, keep_ambiguous=True):\n        if isinstance(file_or_path, str):\n            with open(file_or_path, encoding=\"latin-1\") as f:\n                entries = _parse_cmudict(f)\n        else:\n            entries = _parse_cmudict(file_or_path)\n        if not keep_ambiguous:\n            entries = {word: pron for word, pron in entries.items() if len(pron) == 1}\n        self._entries = entries\n\n    def __len__(self):\n        return len(self._entries)\n\n    def lookup(self, word):\n        \"\"\"Returns list of ARPAbet pronunciations of the given word.\"\"\"\n        return self._entries.get(word.upper())\n\n    @staticmethod\n    def get_arpabet(word, cmudict, punctuation_symbols):\n        first_symbol, last_symbol = \"\", \"\"\n        if word and word[0] in punctuation_symbols:\n            first_symbol = word[0]\n            word = word[1:]\n        if word and word[-1] in punctuation_symbols:\n            last_symbol = word[-1]\n            word = word[:-1]\n        arpabet = cmudict.lookup(word)\n        if arpabet is not None:\n            return first_symbol + \"{%s}\" % arpabet[0] + last_symbol\n        return first_symbol + word + last_symbol\n\n\n_alt_re = re.compile(r\"\\([0-9]+\\)\")\n\n\ndef _parse_cmudict(file):\n    cmudict = {}\n    for line in file:\n        if line and (line[0] >= \"A\" and line[0] <= \"Z\" or line[0] == \"'\"):\n            parts = line.split(\"  \")\n            word = re.sub(_alt_re, \"\", parts[0])\n            pronunciation = _get_pronunciation(parts[1])\n            if pronunciation:\n                if word in cmudict:\n                    cmudict[word].append(pronunciation)\n                else:\n                    cmudict[word] = [pronunciation]\n    return cmudict\n\n\ndef _get_pronunciation(s):\n    parts = s.strip().split(\" \")\n    for part in parts:\n        if part not in VALID_SYMBOLS:\n            return None\n    
return \" \".join(parts)\n"
  },
  {
    "path": "TTS/tts/utils/text/english/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/utils/text/english/abbreviations.py",
    "content": "import re\n\n# List of (regular expression, replacement) pairs for abbreviations in english:\nabbreviations_en = [\n    (re.compile(\"\\\\b%s\\\\.\" % x[0], re.IGNORECASE), x[1])\n    for x in [\n        (\"mrs\", \"misess\"),\n        (\"mr\", \"mister\"),\n        (\"dr\", \"doctor\"),\n        (\"st\", \"saint\"),\n        (\"co\", \"company\"),\n        (\"jr\", \"junior\"),\n        (\"maj\", \"major\"),\n        (\"gen\", \"general\"),\n        (\"drs\", \"doctors\"),\n        (\"rev\", \"reverend\"),\n        (\"lt\", \"lieutenant\"),\n        (\"hon\", \"honorable\"),\n        (\"sgt\", \"sergeant\"),\n        (\"capt\", \"captain\"),\n        (\"esq\", \"esquire\"),\n        (\"ltd\", \"limited\"),\n        (\"col\", \"colonel\"),\n        (\"ft\", \"fort\"),\n    ]\n]\n"
  },
  {
    "path": "TTS/tts/utils/text/english/number_norm.py",
    "content": "\"\"\" from https://github.com/keithito/tacotron \"\"\"\n\nimport re\nfrom typing import Dict\n\nimport inflect\n\n_inflect = inflect.engine()\n_comma_number_re = re.compile(r\"([0-9][0-9\\,]+[0-9])\")\n_decimal_number_re = re.compile(r\"([0-9]+\\.[0-9]+)\")\n_currency_re = re.compile(r\"(£|\\$|¥)([0-9\\,\\.]*[0-9]+)\")\n_ordinal_re = re.compile(r\"[0-9]+(st|nd|rd|th)\")\n_number_re = re.compile(r\"-?[0-9]+\")\n\n\ndef _remove_commas(m):\n    return m.group(1).replace(\",\", \"\")\n\n\ndef _expand_decimal_point(m):\n    return m.group(1).replace(\".\", \" point \")\n\n\ndef __expand_currency(value: str, inflection: Dict[float, str]) -> str:\n    parts = value.replace(\",\", \"\").split(\".\")\n    if len(parts) > 2:\n        return f\"{value} {inflection[2]}\"  # Unexpected format\n    text = []\n    integer = int(parts[0]) if parts[0] else 0\n    if integer > 0:\n        integer_unit = inflection.get(integer, inflection[2])\n        text.append(f\"{integer} {integer_unit}\")\n    fraction = int(parts[1]) if len(parts) > 1 and parts[1] else 0\n    if fraction > 0:\n        fraction_unit = inflection.get(fraction / 100, inflection[0.02])\n        text.append(f\"{fraction} {fraction_unit}\")\n    if len(text) == 0:\n        return f\"zero {inflection[2]}\"\n    return \" \".join(text)\n\n\ndef _expand_currency(m: \"re.Match\") -> str:\n    currencies = {\n        \"$\": {\n            0.01: \"cent\",\n            0.02: \"cents\",\n            1: \"dollar\",\n            2: \"dollars\",\n        },\n        \"€\": {\n            0.01: \"cent\",\n            0.02: \"cents\",\n            1: \"euro\",\n            2: \"euros\",\n        },\n        \"£\": {\n            0.01: \"penny\",\n            0.02: \"pence\",\n            1: \"pound sterling\",\n            2: \"pounds sterling\",\n        },\n        \"¥\": {\n            # TODO rin\n            0.02: \"sen\",\n            2: \"yen\",\n        },\n    }\n    unit = m.group(1)\n    currency = 
currencies[unit]\n    value = m.group(2)\n    return __expand_currency(value, currency)\n\n\ndef _expand_ordinal(m):\n    return _inflect.number_to_words(m.group(0))\n\n\ndef _expand_number(m):\n    num = int(m.group(0))\n    if 1000 < num < 3000:\n        if num == 2000:\n            return \"two thousand\"\n        if 2000 < num < 2010:\n            return \"two thousand \" + _inflect.number_to_words(num % 100)\n        if num % 100 == 0:\n            return _inflect.number_to_words(num // 100) + \" hundred\"\n        return _inflect.number_to_words(num, andword=\"\", zero=\"oh\", group=2).replace(\", \", \" \")\n    return _inflect.number_to_words(num, andword=\"\")\n\n\ndef normalize_numbers(text):\n    text = re.sub(_comma_number_re, _remove_commas, text)\n    text = re.sub(_currency_re, _expand_currency, text)\n    text = re.sub(_decimal_number_re, _expand_decimal_point, text)\n    text = re.sub(_ordinal_re, _expand_ordinal, text)\n    text = re.sub(_number_re, _expand_number, text)\n    return text\n"
  },
  {
    "path": "TTS/tts/utils/text/english/time_norm.py",
    "content": "import re\n\nimport inflect\n\n_inflect = inflect.engine()\n\n_time_re = re.compile(\n    r\"\"\"\\b\n                          ((0?[0-9])|(1[0-1])|(1[2-9])|(2[0-3]))  # hours\n                          :\n                          ([0-5][0-9])                            # minutes\n                          \\s*(a\\\\.m\\\\.|am|pm|p\\\\.m\\\\.|a\\\\.m|p\\\\.m)? # am/pm\n                          \\b\"\"\",\n    re.IGNORECASE | re.X,\n)\n\n\ndef _expand_num(n: int) -> str:\n    return _inflect.number_to_words(n)\n\n\ndef _expand_time_english(match: \"re.Match\") -> str:\n    hour = int(match.group(1))\n    past_noon = hour >= 12\n    time = []\n    if hour > 12:\n        hour -= 12\n    elif hour == 0:\n        hour = 12\n        past_noon = True\n    time.append(_expand_num(hour))\n\n    minute = int(match.group(6))\n    if minute > 0:\n        if minute < 10:\n            time.append(\"oh\")\n        time.append(_expand_num(minute))\n    am_pm = match.group(7)\n    if am_pm is None:\n        time.append(\"p m\" if past_noon else \"a m\")\n    else:\n        time.extend(list(am_pm.replace(\".\", \"\")))\n    return \" \".join(time)\n\n\ndef expand_time_english(text: str) -> str:\n    return re.sub(_time_re, _expand_time_english, text)\n"
  },
  {
    "path": "TTS/tts/utils/text/french/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/utils/text/french/abbreviations.py",
    "content": "import re\n\n# List of (regular expression, replacement) pairs for abbreviations in french:\nabbreviations_fr = [\n    (re.compile(\"\\\\b%s\\\\.\" % x[0], re.IGNORECASE), x[1])\n    for x in [\n        (\"M\", \"monsieur\"),\n        (\"Mlle\", \"mademoiselle\"),\n        (\"Mlles\", \"mesdemoiselles\"),\n        (\"Mme\", \"Madame\"),\n        (\"Mmes\", \"Mesdames\"),\n        (\"N.B\", \"nota bene\"),\n        (\"M\", \"monsieur\"),\n        (\"p.c.q\", \"parce que\"),\n        (\"Pr\", \"professeur\"),\n        (\"qqch\", \"quelque chose\"),\n        (\"rdv\", \"rendez-vous\"),\n        (\"max\", \"maximum\"),\n        (\"min\", \"minimum\"),\n        (\"no\", \"numéro\"),\n        (\"adr\", \"adresse\"),\n        (\"dr\", \"docteur\"),\n        (\"st\", \"saint\"),\n        (\"co\", \"companie\"),\n        (\"jr\", \"junior\"),\n        (\"sgt\", \"sergent\"),\n        (\"capt\", \"capitain\"),\n        (\"col\", \"colonel\"),\n        (\"av\", \"avenue\"),\n        (\"av. J.-C\", \"avant Jésus-Christ\"),\n        (\"apr. J.-C\", \"après Jésus-Christ\"),\n        (\"art\", \"article\"),\n        (\"boul\", \"boulevard\"),\n        (\"c.-à-d\", \"c’est-à-dire\"),\n        (\"etc\", \"et cetera\"),\n        (\"ex\", \"exemple\"),\n        (\"excl\", \"exclusivement\"),\n        (\"boul\", \"boulevard\"),\n    ]\n] + [\n    (re.compile(\"\\\\b%s\" % x[0]), x[1])\n    for x in [\n        (\"Mlle\", \"mademoiselle\"),\n        (\"Mlles\", \"mesdemoiselles\"),\n        (\"Mme\", \"Madame\"),\n        (\"Mmes\", \"Mesdames\"),\n    ]\n]\n"
  },
  {
    "path": "TTS/tts/utils/text/japanese/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/utils/text/japanese/phonemizer.py",
    "content": "# Convert Japanese text to phonemes which is\n# compatible with Julius https://github.com/julius-speech/segmentation-kit\n\nimport re\nimport unicodedata\n\nimport MeCab\nfrom num2words import num2words\n\n_CONVRULES = [\n    # Conversion of 2 letters\n    \"アァ/ a a\",\n    \"イィ/ i i\",\n    \"イェ/ i e\",\n    \"イャ/ y a\",\n    \"ウゥ/ u:\",\n    \"エェ/ e e\",\n    \"オォ/ o:\",\n    \"カァ/ k a:\",\n    \"キィ/ k i:\",\n    \"クゥ/ k u:\",\n    \"クャ/ ky a\",\n    \"クュ/ ky u\",\n    \"クョ/ ky o\",\n    \"ケェ/ k e:\",\n    \"コォ/ k o:\",\n    \"ガァ/ g a:\",\n    \"ギィ/ g i:\",\n    \"グゥ/ g u:\",\n    \"グャ/ gy a\",\n    \"グュ/ gy u\",\n    \"グョ/ gy o\",\n    \"ゲェ/ g e:\",\n    \"ゴォ/ g o:\",\n    \"サァ/ s a:\",\n    \"シィ/ sh i:\",\n    \"スゥ/ s u:\",\n    \"スャ/ sh a\",\n    \"スュ/ sh u\",\n    \"スョ/ sh o\",\n    \"セェ/ s e:\",\n    \"ソォ/ s o:\",\n    \"ザァ/ z a:\",\n    \"ジィ/ j i:\",\n    \"ズゥ/ z u:\",\n    \"ズャ/ zy a\",\n    \"ズュ/ zy u\",\n    \"ズョ/ zy o\",\n    \"ゼェ/ z e:\",\n    \"ゾォ/ z o:\",\n    \"タァ/ t a:\",\n    \"チィ/ ch i:\",\n    \"ツァ/ ts a\",\n    \"ツィ/ ts i\",\n    \"ツゥ/ ts u:\",\n    \"ツャ/ ch a\",\n    \"ツュ/ ch u\",\n    \"ツョ/ ch o\",\n    \"ツェ/ ts e\",\n    \"ツォ/ ts o\",\n    \"テェ/ t e:\",\n    \"トォ/ t o:\",\n    \"ダァ/ d a:\",\n    \"ヂィ/ j i:\",\n    \"ヅゥ/ d u:\",\n    \"ヅャ/ zy a\",\n    \"ヅュ/ zy u\",\n    \"ヅョ/ zy o\",\n    \"デェ/ d e:\",\n    \"ドォ/ d o:\",\n    \"ナァ/ n a:\",\n    \"ニィ/ n i:\",\n    \"ヌゥ/ n u:\",\n    \"ヌャ/ ny a\",\n    \"ヌュ/ ny u\",\n    \"ヌョ/ ny o\",\n    \"ネェ/ n e:\",\n    \"ノォ/ n o:\",\n    \"ハァ/ h a:\",\n    \"ヒィ/ h i:\",\n    \"フゥ/ f u:\",\n    \"フャ/ hy a\",\n    \"フュ/ hy u\",\n    \"フョ/ hy o\",\n    \"ヘェ/ h e:\",\n    \"ホォ/ h o:\",\n    \"バァ/ b a:\",\n    \"ビィ/ b i:\",\n    \"ブゥ/ b u:\",\n    \"フャ/ hy a\",\n    \"ブュ/ by u\",\n    \"フョ/ hy o\",\n    \"ベェ/ b e:\",\n    \"ボォ/ b o:\",\n    \"パァ/ p a:\",\n    \"ピィ/ p i:\",\n    \"プゥ/ p u:\",\n    \"プャ/ py a\",\n    \"プュ/ py u\",\n    \"プョ/ py o\",\n    \"ペェ/ p e:\",\n    \"ポォ/ p o:\",\n    
\"マァ/ m a:\",\n    \"ミィ/ m i:\",\n    \"ムゥ/ m u:\",\n    \"ムャ/ my a\",\n    \"ムュ/ my u\",\n    \"ムョ/ my o\",\n    \"メェ/ m e:\",\n    \"モォ/ m o:\",\n    \"ヤァ/ y a:\",\n    \"ユゥ/ y u:\",\n    \"ユャ/ y a:\",\n    \"ユュ/ y u:\",\n    \"ユョ/ y o:\",\n    \"ヨォ/ y o:\",\n    \"ラァ/ r a:\",\n    \"リィ/ r i:\",\n    \"ルゥ/ r u:\",\n    \"ルャ/ ry a\",\n    \"ルュ/ ry u\",\n    \"ルョ/ ry o\",\n    \"レェ/ r e:\",\n    \"ロォ/ r o:\",\n    \"ワァ/ w a:\",\n    \"ヲォ/ o:\",\n    \"ディ/ d i\",\n    \"デェ/ d e:\",\n    \"デャ/ dy a\",\n    \"デュ/ dy u\",\n    \"デョ/ dy o\",\n    \"ティ/ t i\",\n    \"テェ/ t e:\",\n    \"テャ/ ty a\",\n    \"テュ/ ty u\",\n    \"テョ/ ty o\",\n    \"スィ/ s i\",\n    \"ズァ/ z u a\",\n    \"ズィ/ z i\",\n    \"ズゥ/ z u\",\n    \"ズャ/ zy a\",\n    \"ズュ/ zy u\",\n    \"ズョ/ zy o\",\n    \"ズェ/ z e\",\n    \"ズォ/ z o\",\n    \"キャ/ ky a\",\n    \"キュ/ ky u\",\n    \"キョ/ ky o\",\n    \"シャ/ sh a\",\n    \"シュ/ sh u\",\n    \"シェ/ sh e\",\n    \"ショ/ sh o\",\n    \"チャ/ ch a\",\n    \"チュ/ ch u\",\n    \"チェ/ ch e\",\n    \"チョ/ ch o\",\n    \"トゥ/ t u\",\n    \"トャ/ ty a\",\n    \"トュ/ ty u\",\n    \"トョ/ ty o\",\n    \"ドァ/ d o a\",\n    \"ドゥ/ d u\",\n    \"ドャ/ dy a\",\n    \"ドュ/ dy u\",\n    \"ドョ/ dy o\",\n    \"ドォ/ d o:\",\n    \"ニャ/ ny a\",\n    \"ニュ/ ny u\",\n    \"ニョ/ ny o\",\n    \"ヒャ/ hy a\",\n    \"ヒュ/ hy u\",\n    \"ヒョ/ hy o\",\n    \"ミャ/ my a\",\n    \"ミュ/ my u\",\n    \"ミョ/ my o\",\n    \"リャ/ ry a\",\n    \"リュ/ ry u\",\n    \"リョ/ ry o\",\n    \"ギャ/ gy a\",\n    \"ギュ/ gy u\",\n    \"ギョ/ gy o\",\n    \"ヂェ/ j e\",\n    \"ヂャ/ j a\",\n    \"ヂュ/ j u\",\n    \"ヂョ/ j o\",\n    \"ジェ/ j e\",\n    \"ジャ/ j a\",\n    \"ジュ/ j u\",\n    \"ジョ/ j o\",\n    \"ビャ/ by a\",\n    \"ビュ/ by u\",\n    \"ビョ/ by o\",\n    \"ピャ/ py a\",\n    \"ピュ/ py u\",\n    \"ピョ/ py o\",\n    \"ウァ/ u a\",\n    \"ウィ/ w i\",\n    \"ウェ/ w e\",\n    \"ウォ/ w o\",\n    \"ファ/ f a\",\n    \"フィ/ f i\",\n    \"フゥ/ f u\",\n    \"フャ/ hy a\",\n    \"フュ/ hy u\",\n    \"フョ/ hy o\",\n    \"フェ/ f e\",\n    \"フォ/ f o\",\n    \"ヴァ/ b a\",\n    \"ヴィ/ b 
i\",\n    \"ヴェ/ b e\",\n    \"ヴォ/ b o\",\n    \"ヴュ/ by u\",\n    # Conversion of 1 letter\n    \"ア/ a\",\n    \"イ/ i\",\n    \"ウ/ u\",\n    \"エ/ e\",\n    \"オ/ o\",\n    \"カ/ k a\",\n    \"キ/ k i\",\n    \"ク/ k u\",\n    \"ケ/ k e\",\n    \"コ/ k o\",\n    \"サ/ s a\",\n    \"シ/ sh i\",\n    \"ス/ s u\",\n    \"セ/ s e\",\n    \"ソ/ s o\",\n    \"タ/ t a\",\n    \"チ/ ch i\",\n    \"ツ/ ts u\",\n    \"テ/ t e\",\n    \"ト/ t o\",\n    \"ナ/ n a\",\n    \"ニ/ n i\",\n    \"ヌ/ n u\",\n    \"ネ/ n e\",\n    \"ノ/ n o\",\n    \"ハ/ h a\",\n    \"ヒ/ h i\",\n    \"フ/ f u\",\n    \"ヘ/ h e\",\n    \"ホ/ h o\",\n    \"マ/ m a\",\n    \"ミ/ m i\",\n    \"ム/ m u\",\n    \"メ/ m e\",\n    \"モ/ m o\",\n    \"ラ/ r a\",\n    \"リ/ r i\",\n    \"ル/ r u\",\n    \"レ/ r e\",\n    \"ロ/ r o\",\n    \"ガ/ g a\",\n    \"ギ/ g i\",\n    \"グ/ g u\",\n    \"ゲ/ g e\",\n    \"ゴ/ g o\",\n    \"ザ/ z a\",\n    \"ジ/ j i\",\n    \"ズ/ z u\",\n    \"ゼ/ z e\",\n    \"ゾ/ z o\",\n    \"ダ/ d a\",\n    \"ヂ/ j i\",\n    \"ヅ/ z u\",\n    \"デ/ d e\",\n    \"ド/ d o\",\n    \"バ/ b a\",\n    \"ビ/ b i\",\n    \"ブ/ b u\",\n    \"ベ/ b e\",\n    \"ボ/ b o\",\n    \"パ/ p a\",\n    \"ピ/ p i\",\n    \"プ/ p u\",\n    \"ペ/ p e\",\n    \"ポ/ p o\",\n    \"ヤ/ y a\",\n    \"ユ/ y u\",\n    \"ヨ/ y o\",\n    \"ワ/ w a\",\n    \"ヰ/ i\",\n    \"ヱ/ e\",\n    \"ヲ/ o\",\n    \"ン/ N\",\n    \"ッ/ q\",\n    \"ヴ/ b u\",\n    \"ー/:\",\n    # Try converting broken text\n    \"ァ/ a\",\n    \"ィ/ i\",\n    \"ゥ/ u\",\n    \"ェ/ e\",\n    \"ォ/ o\",\n    \"ヮ/ w a\",\n    \"ォ/ o\",\n    # Symbols\n    \"、/ ,\",\n    \"。/ .\",\n    \"！/ !\",\n    \"？/ ?\",\n    \"・/ ,\",\n]\n\n_COLON_RX = re.compile(\":+\")\n_REJECT_RX = re.compile(\"[^ a-zA-Z:,.?]\")\n\n\ndef _makerulemap():\n    l = [tuple(x.split(\"/\")) for x in _CONVRULES]\n    return tuple({k: v for k, v in l if len(k) == i} for i in (1, 2))\n\n\n_RULEMAP1, _RULEMAP2 = _makerulemap()\n\n\ndef kata2phoneme(text: str) -> str:\n    \"\"\"Convert katakana text to phonemes.\"\"\"\n    text = text.strip()\n    res = 
\"\"\n    while text:\n        if len(text) >= 2:\n            x = _RULEMAP2.get(text[:2])\n            if x is not None:\n                text = text[2:]\n                res += x\n                continue\n        x = _RULEMAP1.get(text[0])\n        if x is not None:\n            text = text[1:]\n            res += x\n            continue\n        res += \" \" + text[0]\n        text = text[1:]\n    res = _COLON_RX.sub(\":\", res)\n    return res[1:]\n\n\n_KATAKANA = \"\".join(chr(ch) for ch in range(ord(\"ァ\"), ord(\"ン\") + 1))\n_HIRAGANA = \"\".join(chr(ch) for ch in range(ord(\"ぁ\"), ord(\"ん\") + 1))\n_HIRA2KATATRANS = str.maketrans(_HIRAGANA, _KATAKANA)\n\n\ndef hira2kata(text: str) -> str:\n    text = text.translate(_HIRA2KATATRANS)\n    return text.replace(\"う゛\", \"ヴ\")\n\n\n_SYMBOL_TOKENS = set(list(\"・、。？！\"))\n_NO_YOMI_TOKENS = set(list(\"「」『』―（）［］[]　…\"))\n_TAGGER = MeCab.Tagger()\n\n\ndef text2kata(text: str) -> str:\n    parsed = _TAGGER.parse(text)\n    res = []\n    for line in parsed.split(\"\\n\"):\n        if line == \"EOS\":\n            break\n        parts = line.split(\"\\t\")\n\n        word, yomi = parts[0], parts[1]\n        if yomi:\n            res.append(yomi)\n        else:\n            if word in _SYMBOL_TOKENS:\n                res.append(word)\n            elif word in (\"っ\", \"ッ\"):\n                res.append(\"ッ\")\n            elif word in _NO_YOMI_TOKENS:\n                pass\n            else:\n                res.append(word)\n    return hira2kata(\"\".join(res))\n\n\n_ALPHASYMBOL_YOMI = {\n    \"#\": \"シャープ\",\n    \"%\": \"パーセント\",\n    \"&\": \"アンド\",\n    \"+\": \"プラス\",\n    \"-\": \"マイナス\",\n    \":\": \"コロン\",\n    \";\": \"セミコロン\",\n    \"<\": \"小なり\",\n    \"=\": \"イコール\",\n    \">\": \"大なり\",\n    \"@\": \"アット\",\n    \"a\": \"エー\",\n    \"b\": \"ビー\",\n    \"c\": \"シー\",\n    \"d\": \"ディー\",\n    \"e\": \"イー\",\n    \"f\": \"エフ\",\n    \"g\": \"ジー\",\n    \"h\": \"エイチ\",\n    \"i\": \"アイ\",\n    \"j\": 
\"ジェー\",\n    \"k\": \"ケー\",\n    \"l\": \"エル\",\n    \"m\": \"エム\",\n    \"n\": \"エヌ\",\n    \"o\": \"オー\",\n    \"p\": \"ピー\",\n    \"q\": \"キュー\",\n    \"r\": \"アール\",\n    \"s\": \"エス\",\n    \"t\": \"ティー\",\n    \"u\": \"ユー\",\n    \"v\": \"ブイ\",\n    \"w\": \"ダブリュー\",\n    \"x\": \"エックス\",\n    \"y\": \"ワイ\",\n    \"z\": \"ゼット\",\n    \"α\": \"アルファ\",\n    \"β\": \"ベータ\",\n    \"γ\": \"ガンマ\",\n    \"δ\": \"デルタ\",\n    \"ε\": \"イプシロン\",\n    \"ζ\": \"ゼータ\",\n    \"η\": \"イータ\",\n    \"θ\": \"シータ\",\n    \"ι\": \"イオタ\",\n    \"κ\": \"カッパ\",\n    \"λ\": \"ラムダ\",\n    \"μ\": \"ミュー\",\n    \"ν\": \"ニュー\",\n    \"ξ\": \"クサイ\",\n    \"ο\": \"オミクロン\",\n    \"π\": \"パイ\",\n    \"ρ\": \"ロー\",\n    \"σ\": \"シグマ\",\n    \"τ\": \"タウ\",\n    \"υ\": \"ウプシロン\",\n    \"φ\": \"ファイ\",\n    \"χ\": \"カイ\",\n    \"ψ\": \"プサイ\",\n    \"ω\": \"オメガ\",\n}\n\n\n_NUMBER_WITH_SEPARATOR_RX = re.compile(\"[0-9]{1,3}(,[0-9]{3})+\")\n_CURRENCY_MAP = {\"$\": \"ドル\", \"¥\": \"円\", \"£\": \"ポンド\", \"€\": \"ユーロ\"}\n_CURRENCY_RX = re.compile(r\"([$¥£€])([0-9.]*[0-9])\")\n_NUMBER_RX = re.compile(r\"[0-9]+(\\.[0-9]+)?\")\n\n\ndef japanese_convert_numbers_to_words(text: str) -> str:\n    res = _NUMBER_WITH_SEPARATOR_RX.sub(lambda m: m[0].replace(\",\", \"\"), text)\n    res = _CURRENCY_RX.sub(lambda m: m[2] + _CURRENCY_MAP.get(m[1], m[1]), res)\n    res = _NUMBER_RX.sub(lambda m: num2words(m[0], lang=\"ja\"), res)\n    return res\n\n\ndef japanese_convert_alpha_symbols_to_words(text: str) -> str:\n    return \"\".join([_ALPHASYMBOL_YOMI.get(ch, ch) for ch in text.lower()])\n\n\ndef japanese_text_to_phonemes(text: str) -> str:\n    \"\"\"Convert Japanese text to phonemes.\"\"\"\n    res = unicodedata.normalize(\"NFKC\", text)\n    res = japanese_convert_numbers_to_words(res)\n    res = japanese_convert_alpha_symbols_to_words(res)\n    res = text2kata(res)\n    res = kata2phoneme(res)\n    return res.replace(\" \", \"\")\n"
  },
  {
    "path": "TTS/tts/utils/text/korean/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/tts/utils/text/korean/ko_dictionary.py",
    "content": "# coding: utf-8\n# Add the word you want to the dictionary.\netc_dictionary = {\"1+1\": \"원플러스원\", \"2+1\": \"투플러스원\"}\n\n\nenglish_dictionary = {\n    \"KOREA\": \"코리아\",\n    \"IDOL\": \"아이돌\",\n    \"IT\": \"아이티\",\n    \"IQ\": \"아이큐\",\n    \"UP\": \"업\",\n    \"DOWN\": \"다운\",\n    \"PC\": \"피씨\",\n    \"CCTV\": \"씨씨티비\",\n    \"SNS\": \"에스엔에스\",\n    \"AI\": \"에이아이\",\n    \"CEO\": \"씨이오\",\n    \"A\": \"에이\",\n    \"B\": \"비\",\n    \"C\": \"씨\",\n    \"D\": \"디\",\n    \"E\": \"이\",\n    \"F\": \"에프\",\n    \"G\": \"지\",\n    \"H\": \"에이치\",\n    \"I\": \"아이\",\n    \"J\": \"제이\",\n    \"K\": \"케이\",\n    \"L\": \"엘\",\n    \"M\": \"엠\",\n    \"N\": \"엔\",\n    \"O\": \"오\",\n    \"P\": \"피\",\n    \"Q\": \"큐\",\n    \"R\": \"알\",\n    \"S\": \"에스\",\n    \"T\": \"티\",\n    \"U\": \"유\",\n    \"V\": \"브이\",\n    \"W\": \"더블유\",\n    \"X\": \"엑스\",\n    \"Y\": \"와이\",\n    \"Z\": \"제트\",\n}\n"
  },
  {
    "path": "TTS/tts/utils/text/korean/korean.py",
    "content": "﻿# coding: utf-8\n# Code based on https://github.com/carpedm20/multi-speaker-tacotron-tensorflow/blob/master/text/korean.py\nimport re\n\nfrom TTS.tts.utils.text.korean.ko_dictionary import english_dictionary, etc_dictionary\n\n\ndef normalize(text):\n    text = text.strip()\n    text = re.sub(\"[⺀-⺙⺛-⻳⼀-⿕々〇〡-〩〸-〺〻㐀-䶵一-鿃豈-鶴侮-頻並-龎]\", \"\", text)\n    text = normalize_with_dictionary(text, etc_dictionary)\n    text = normalize_english(text)\n    text = text.lower()\n    return text\n\n\ndef normalize_with_dictionary(text, dic):\n    if any(key in text for key in dic.keys()):\n        pattern = re.compile(\"|\".join(re.escape(key) for key in dic.keys()))\n        return pattern.sub(lambda x: dic[x.group()], text)\n    return text\n\n\ndef normalize_english(text):\n    def fn(m):\n        word = m.group()\n        if word in english_dictionary:\n            return english_dictionary.get(word)\n        return word\n\n    text = re.sub(\"([A-Za-z]+)\", fn, text)\n    return text\n"
  },
  {
    "path": "TTS/tts/utils/text/korean/phonemizer.py",
    "content": "from jamo import hangul_to_jamo\n\nfrom TTS.tts.utils.text.korean.korean import normalize\n\ng2p = None\n\n\ndef korean_text_to_phonemes(text, character: str = \"hangeul\") -> str:\n    \"\"\"\n\n    The input and output values look the same, but they are different in Unicode.\n\n    example :\n\n        input = '하늘' (Unicode : \\ud558\\ub298), (하 + 늘)\n        output = '하늘' (Unicode :\\u1112\\u1161\\u1102\\u1173\\u11af), (ᄒ + ᅡ + ᄂ + ᅳ + ᆯ)\n\n    \"\"\"\n    global g2p  # pylint: disable=global-statement\n    if g2p is None:\n        from g2pkk import G2p\n\n        g2p = G2p()\n\n    if character == \"english\":\n        from anyascii import anyascii\n\n        text = normalize(text)\n        text = g2p(text)\n        text = anyascii(text)\n        return text\n\n    text = normalize(text)\n    text = g2p(text)\n    text = list(hangul_to_jamo(text))  # '하늘' --> ['ᄒ', 'ᅡ', 'ᄂ', 'ᅳ', 'ᆯ']\n    return \"\".join(text)\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/__init__.py",
    "content": "from TTS.tts.utils.text.phonemizers.base import BasePhonemizer\nfrom TTS.tts.utils.text.phonemizers.espeak_wrapper import ESpeak\nfrom TTS.tts.utils.text.phonemizers.gruut_wrapper import Gruut\nfrom TTS.tts.utils.text.phonemizers.ja_jp_phonemizer import JA_JP_Phonemizer\nfrom TTS.tts.utils.text.phonemizers.ko_kr_phonemizer import KO_KR_Phonemizer\nfrom TTS.tts.utils.text.phonemizers.zh_cn_phonemizer import ZH_CN_Phonemizer\n\nPHONEMIZERS = {b.name(): b for b in (ESpeak, Gruut, JA_JP_Phonemizer)}\n\n\nESPEAK_LANGS = list(ESpeak.supported_languages().keys())\nGRUUT_LANGS = list(Gruut.supported_languages())\n\n\n# Dict setting default phonemizers for each language\n# Add Gruut languages\n_ = [Gruut.name()] * len(GRUUT_LANGS)\nDEF_LANG_TO_PHONEMIZER = dict(list(zip(GRUUT_LANGS, _)))\n\n\n# Add ESpeak languages and override any existing ones\n_ = [ESpeak.name()] * len(ESPEAK_LANGS)\n_new_dict = dict(list(zip(list(ESPEAK_LANGS), _)))\nDEF_LANG_TO_PHONEMIZER.update(_new_dict)\n\n# Force default for some languages\nDEF_LANG_TO_PHONEMIZER[\"en\"] = DEF_LANG_TO_PHONEMIZER[\"en-us\"]\nDEF_LANG_TO_PHONEMIZER[\"ja-jp\"] = JA_JP_Phonemizer.name()\nDEF_LANG_TO_PHONEMIZER[\"zh-cn\"] = ZH_CN_Phonemizer.name()\nDEF_LANG_TO_PHONEMIZER[\"ko-kr\"] = KO_KR_Phonemizer.name()\n\n\ndef get_phonemizer_by_name(name: str, **kwargs) -> BasePhonemizer:\n    \"\"\"Initiate a phonemizer by name\n\n    Args:\n        name (str):\n            Name of the phonemizer that should match `phonemizer.name()`.\n\n        kwargs (dict):\n            Extra keyword arguments that should be passed to the phonemizer.\n    \"\"\"\n    if name == \"espeak\":\n        return ESpeak(**kwargs)\n    if name == \"gruut\":\n        return Gruut(**kwargs)\n    if name == \"zh_cn_phonemizer\":\n        return ZH_CN_Phonemizer(**kwargs)\n    if name == \"ja_jp_phonemizer\":\n        return JA_JP_Phonemizer(**kwargs)\n    if name == \"ko_kr_phonemizer\":\n        return KO_KR_Phonemizer(**kwargs)\n    
raise ValueError(f\"Phonemizer {name} not found\")\n\n\nif __name__ == \"__main__\":\n    print(DEF_LANG_TO_PHONEMIZER)\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/base.py",
    "content": "import abc\nfrom typing import List, Tuple\n\nfrom TTS.tts.utils.text.punctuation import Punctuation\n\n\nclass BasePhonemizer(abc.ABC):\n    \"\"\"Base phonemizer class\n\n    Phonemization follows the following steps:\n        1. Preprocessing:\n            - remove empty lines\n            - remove punctuation\n            - keep track of punctuation marks\n\n        2. Phonemization:\n            - convert text to phonemes\n\n        3. Postprocessing:\n            - join phonemes\n            - restore punctuation marks\n\n    Args:\n        language (str):\n            Language used by the phonemizer.\n\n        punctuations (List[str]):\n            List of punctuation marks to be preserved.\n\n        keep_puncs (bool):\n            Whether to preserve punctuation marks or not.\n    \"\"\"\n\n    def __init__(self, language, punctuations=Punctuation.default_puncs(), keep_puncs=False):\n        # ensure the backend is installed on the system\n        if not self.is_available():\n            raise RuntimeError(\"{} not installed on your system\".format(self.name()))  # pragma: nocover\n\n        # ensure the backend support the requested language\n        self._language = self._init_language(language)\n\n        # setup punctuation processing\n        self._keep_puncs = keep_puncs\n        self._punctuator = Punctuation(punctuations)\n\n    def _init_language(self, language):\n        \"\"\"Language initialization\n\n        This method may be overloaded in child classes (see Segments backend)\n\n        \"\"\"\n        if not self.is_supported_language(language):\n            raise RuntimeError(f'language \"{language}\" is not supported by the ' f\"{self.name()} backend\")\n        return language\n\n    @property\n    def language(self):\n        \"\"\"The language code configured to be used for phonemization\"\"\"\n        return self._language\n\n    @staticmethod\n    @abc.abstractmethod\n    def name():\n        \"\"\"The name of the 
backend\"\"\"\n        ...\n\n    @classmethod\n    @abc.abstractmethod\n    def is_available(cls):\n        \"\"\"Returns True if the backend is installed, False otherwise\"\"\"\n        ...\n\n    @classmethod\n    @abc.abstractmethod\n    def version(cls):\n        \"\"\"Return the backend version as a tuple (major, minor, patch)\"\"\"\n        ...\n\n    @staticmethod\n    @abc.abstractmethod\n    def supported_languages():\n        \"\"\"Return a dict of language codes -> name supported by the backend\"\"\"\n        ...\n\n    def is_supported_language(self, language):\n        \"\"\"Returns True if `language` is supported by the backend\"\"\"\n        return language in self.supported_languages()\n\n    @abc.abstractmethod\n    def _phonemize(self, text, separator):\n        \"\"\"The main phonemization method\"\"\"\n\n    def _phonemize_preprocess(self, text) -> Tuple[List[str], List]:\n        \"\"\"Preprocess the text before phonemization\n\n        1. remove spaces\n        2. 
remove punctuation\n\n        Override this if you need a different behaviour\n        \"\"\"\n        text = text.strip()\n        if self._keep_puncs:\n            # a tuple (text, punctuation marks)\n            return self._punctuator.strip_to_restore(text)\n        return [self._punctuator.strip(text)], []\n\n    def _phonemize_postprocess(self, phonemized, punctuations) -> str:\n        \"\"\"Postprocess the raw phonemized output\n\n        Override this if you need a different behaviour\n        \"\"\"\n        if self._keep_puncs:\n            return self._punctuator.restore(phonemized, punctuations)[0]\n        return phonemized[0]\n\n    def phonemize(self, text: str, separator=\"|\", language: str = None) -> str:  # pylint: disable=unused-argument\n        \"\"\"Returns the `text` phonemized for the given language\n\n        Args:\n            text (str):\n                Text to be phonemized.\n\n            separator (str):\n                String separator used between phonemes. Defaults to '|'.\n\n            language (str):\n                Unused by the base implementation. Defaults to None.\n\n        Returns:\n            (str): Phonemized text\n        \"\"\"\n        text, punctuations = self._phonemize_preprocess(text)\n        phonemized = []\n        for t in text:\n            p = self._phonemize(t, separator)\n            phonemized.append(p)\n        phonemized = self._phonemize_postprocess(phonemized, punctuations)\n        return phonemized\n\n    def print_logs(self, level: int = 0):\n        indent = \"\\t\" * level\n        print(f\"{indent}| > phoneme language: {self.language}\")\n        print(f\"{indent}| > phoneme backend: {self.name()}\")\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/espeak_wrapper.py",
    "content": "import logging\nimport re\nimport subprocess\nfrom typing import Dict, List\n\nfrom packaging.version import Version\n\nfrom TTS.tts.utils.text.phonemizers.base import BasePhonemizer\nfrom TTS.tts.utils.text.punctuation import Punctuation\n\n\ndef is_tool(name):\n    from shutil import which\n\n    return which(name) is not None\n\n\n# Use a regex pattern to match the espeak version, because it may be\n# symlinked to espeak-ng, which moves the version bits to another spot.\nespeak_version_pattern = re.compile(r\"text-to-speech:\\s(?P<version>\\d+\\.\\d+(\\.\\d+)?)\")\n\n\ndef get_espeak_version():\n    output = subprocess.getoutput(\"espeak --version\")\n    match = espeak_version_pattern.search(output)\n\n    return match.group(\"version\")\n\n\ndef get_espeakng_version():\n    output = subprocess.getoutput(\"espeak-ng --version\")\n    return output.split()[3]\n\n\n# priority: espeakng > espeak\nif is_tool(\"espeak-ng\"):\n    _DEF_ESPEAK_LIB = \"espeak-ng\"\n    _DEF_ESPEAK_VER = get_espeakng_version()\nelif is_tool(\"espeak\"):\n    _DEF_ESPEAK_LIB = \"espeak\"\n    _DEF_ESPEAK_VER = get_espeak_version()\nelse:\n    _DEF_ESPEAK_LIB = None\n    _DEF_ESPEAK_VER = None\n\n\ndef _espeak_exe(espeak_lib: str, args: List, sync=False) -> List[str]:\n    \"\"\"Run espeak with the given arguments.\"\"\"\n    cmd = [\n        espeak_lib,\n        \"-q\",\n        \"-b\",\n        \"1\",  # UTF8 text encoding\n    ]\n    cmd.extend(args)\n    logging.debug(\"espeakng: executing %s\", repr(cmd))\n\n    with subprocess.Popen(\n        cmd,\n        stdout=subprocess.PIPE,\n        stderr=subprocess.STDOUT,\n    ) as p:\n        res = iter(p.stdout.readline, b\"\")\n        if not sync:\n            p.stdout.close()\n            if p.stderr:\n                p.stderr.close()\n            if p.stdin:\n                p.stdin.close()\n            return res\n        res2 = []\n        for line in res:\n            res2.append(line)\n        p.stdout.close()\n    
    if p.stderr:\n            p.stderr.close()\n        if p.stdin:\n            p.stdin.close()\n        p.wait()\n    return res2\n\n\nclass ESpeak(BasePhonemizer):\n    \"\"\"ESpeak wrapper calling `espeak` or `espeak-ng` from the command-line to perform G2P\n\n    Args:\n        language (str):\n            Valid language code for the used backend.\n\n        backend (str):\n            Name of the backend library to use. `espeak` or `espeak-ng`. If None, set automatically\n            preferring `espeak-ng` over `espeak`. Defaults to None.\n\n        punctuations (str):\n            Characters to be treated as punctuation. Defaults to Punctuation.default_puncs().\n\n        keep_puncs (bool):\n            If True, keep the punctuations after phonemization. Defaults to True.\n\n    Example:\n\n        >>> from TTS.tts.utils.text.phonemizers import ESpeak\n        >>> phonemizer = ESpeak(\"tr\")\n        >>> phonemizer.phonemize(\"Bu Türkçe, bir örnektir.\", separator=\"|\")\n        'b|ʊ t|ˈø|r|k|tʃ|ɛ, b|ɪ|r œ|r|n|ˈɛ|c|t|ɪ|r.'\n\n    \"\"\"\n\n    _ESPEAK_LIB = _DEF_ESPEAK_LIB\n    _ESPEAK_VER = _DEF_ESPEAK_VER\n\n    def __init__(self, language: str, backend=None, punctuations=Punctuation.default_puncs(), keep_puncs=True):\n        if self._ESPEAK_LIB is None:\n            raise Exception(\" [!] No espeak backend found. 
Install espeak-ng or espeak to your system.\")\n        self.backend = self._ESPEAK_LIB\n\n        # band-aid for backwards compatibility\n        if language == \"en\":\n            language = \"en-us\"\n        if language == \"zh-cn\":\n            language = \"cmn\"\n\n        super().__init__(language, punctuations=punctuations, keep_puncs=keep_puncs)\n        if backend is not None:\n            self.backend = backend\n\n    @property\n    def backend(self):\n        return self._ESPEAK_LIB\n\n    @property\n    def backend_version(self):\n        return self._ESPEAK_VER\n\n    @backend.setter\n    def backend(self, backend):\n        if backend not in [\"espeak\", \"espeak-ng\"]:\n            raise Exception(\"Unknown backend: %s\" % backend)\n        self._ESPEAK_LIB = backend\n        self._ESPEAK_VER = get_espeakng_version() if backend == \"espeak-ng\" else get_espeak_version()\n\n    def auto_set_espeak_lib(self) -> None:\n        if is_tool(\"espeak-ng\"):\n            self._ESPEAK_LIB = \"espeak-ng\"\n            self._ESPEAK_VER = get_espeakng_version()\n        elif is_tool(\"espeak\"):\n            self._ESPEAK_LIB = \"espeak\"\n            self._ESPEAK_VER = get_espeak_version()\n        else:\n            raise Exception(\"Cannot set backend automatically. espeak-ng or espeak not found\")\n\n    @staticmethod\n    def name():\n        return \"espeak\"\n\n    def phonemize_espeak(self, text: str, separator: str = \"|\", tie=False) -> str:\n        \"\"\"Convert input text to phonemes.\n\n        Args:\n            text (str):\n                Text to be converted to phonemes.\n\n            tie (bool, optional) : When True use a '͡' character between\n                consecutive characters of a single phoneme. Else separate phoneme\n                with '_'. This option requires espeak>=1.49. 
Default to False.\n        \"\"\"\n        # set arguments\n        args = [\"-v\", f\"{self._language}\"]\n        # espeak and espeak-ng parses `ipa` differently\n        if tie:\n            # use '͡' between phonemes\n            if self.backend == \"espeak\":\n                args.append(\"--ipa=1\")\n            else:\n                args.append(\"--ipa=3\")\n        else:\n            # split with '_'\n            if self.backend == \"espeak\":\n                if Version(self.backend_version) >= Version(\"1.48.15\"):\n                    args.append(\"--ipa=1\")\n                else:\n                    args.append(\"--ipa=3\")\n            else:\n                args.append(\"--ipa=1\")\n        if tie:\n            args.append(\"--tie=%s\" % tie)\n\n        args.append('\"' + text + '\"')\n        # compute phonemes\n        phonemes = \"\"\n        for line in _espeak_exe(self._ESPEAK_LIB, args, sync=True):\n            logging.debug(\"line: %s\", repr(line))\n            ph_decoded = line.decode(\"utf8\").strip()\n            # espeak need to skip first two characters of the retuned text:\n            #   version 1.48.03: \"_ p_ɹ_ˈaɪ_ɚ t_ə n_oʊ_v_ˈɛ_m_b_ɚ t_w_ˈɛ_n_t_i t_ˈuː\\n\"\n            #   version 1.48.15: \" p_ɹ_ˈaɪ_ɚ t_ə n_oʊ_v_ˈɛ_m_b_ɚ t_w_ˈɛ_n_t_i t_ˈuː\\n\"\n            # espeak-ng need to skip the first character of the retuned text:\n            #   \"_p_ɹ_ˈaɪ_ɚ t_ə n_oʊ_v_ˈɛ_m_b_ɚ t_w_ˈɛ_n_t_i t_ˈuː\\n\"\n\n            # dealing with the conditions descrived above\n            ph_decoded = ph_decoded[:1].replace(\"_\", \"\") + ph_decoded[1:]\n\n            # espeak-ng backend can add language flags that need to be removed:\n            #   \"sɛʁtˈɛ̃ mˈo kɔm (en)fˈʊtbɔːl(fr) ʒenˈɛʁ de- flˈaɡ də- lˈɑ̃ɡ.\"\n            # phonemize needs to remove the language flags of the returned text:\n            #   \"sɛʁtˈɛ̃ mˈo kɔm fˈʊtbɔːl ʒenˈɛʁ de- flˈaɡ də- lˈɑ̃ɡ.\"\n            ph_decoded = re.sub(r\"\\(.+?\\)\", \"\", ph_decoded)\n\n            
phonemes += ph_decoded.strip()\n        return phonemes.replace(\"_\", separator)\n\n    def _phonemize(self, text, separator=None):\n        return self.phonemize_espeak(text, separator, tie=False)\n\n    @staticmethod\n    def supported_languages() -> Dict:\n        \"\"\"Get a dictionary of supported languages.\n\n        Returns:\n            Dict: Dictionary of language codes.\n        \"\"\"\n        if _DEF_ESPEAK_LIB is None:\n            return {}\n        args = [\"--voices\"]\n        langs = {}\n        count = 0\n        for line in _espeak_exe(_DEF_ESPEAK_LIB, args, sync=True):\n            line = line.decode(\"utf8\").strip()\n            if count > 0:\n                cols = line.split()\n                lang_code = cols[1]\n                lang_name = cols[3]\n                langs[lang_code] = lang_name\n            logging.debug(\"line: %s\", repr(line))\n            count += 1\n        return langs\n\n    def version(self) -> str:\n        \"\"\"Get the version of the used backend.\n\n        Returns:\n            str: Version of the used backend.\n        \"\"\"\n        args = [\"--version\"]\n        for line in _espeak_exe(self.backend, args, sync=True):\n            version = line.decode(\"utf8\").strip().split()[2]\n            logging.debug(\"line: %s\", repr(line))\n            return version\n\n    @classmethod\n    def is_available(cls):\n        \"\"\"Return true if ESpeak is available else false\"\"\"\n        return is_tool(\"espeak\") or is_tool(\"espeak-ng\")\n\n\nif __name__ == \"__main__\":\n    e = ESpeak(language=\"en-us\")\n    print(e.supported_languages())\n    print(e.version())\n    print(e.language)\n    print(e.name())\n    print(e.is_available())\n\n    e = ESpeak(language=\"en-us\", keep_puncs=False)\n    print(\"`\" + e.phonemize(\"hello how are you today?\") + \"`\")\n\n    e = ESpeak(language=\"en-us\", keep_puncs=True)\n    print(\"`\" + e.phonemize(\"hello how are you today?\") + \"`\")\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/gruut_wrapper.py",
    "content": "import importlib\nfrom typing import List\n\nimport gruut\nfrom gruut_ipa import IPA\n\nfrom TTS.tts.utils.text.phonemizers.base import BasePhonemizer\nfrom TTS.tts.utils.text.punctuation import Punctuation\n\n# Table for str.translate to fix gruut/TTS phoneme mismatch\nGRUUT_TRANS_TABLE = str.maketrans(\"g\", \"ɡ\")\n\n\nclass Gruut(BasePhonemizer):\n    \"\"\"Gruut wrapper for G2P\n\n    Args:\n        language (str):\n            Valid language code for the used backend.\n\n        punctuations (str):\n            Characters to be treated as punctuation. Defaults to `Punctuation.default_puncs()`.\n\n        keep_puncs (bool):\n            If true, keep the punctuations after phonemization. Defaults to True.\n\n        use_espeak_phonemes (bool):\n            If true, use espeak lexicons instead of default Gruut lexicons. Defaults to False.\n\n        keep_stress (bool):\n            If true, keep the stress characters after phonemization. Defaults to False.\n\n    Example:\n\n        >>> from TTS.tts.utils.text.phonemizers.gruut_wrapper import Gruut\n        >>> phonemizer = Gruut('en-us')\n        >>> phonemizer.phonemize(\"Be a voice, not an! echo?\", separator=\"|\")\n        'b|i| ə| v|ɔ|ɪ|s, n|ɑ|t| ə|n! 
ɛ|k|o|ʊ?'\n    \"\"\"\n\n    def __init__(\n        self,\n        language: str,\n        punctuations=Punctuation.default_puncs(),\n        keep_puncs=True,\n        use_espeak_phonemes=False,\n        keep_stress=False,\n    ):\n        super().__init__(language, punctuations=punctuations, keep_puncs=keep_puncs)\n        self.use_espeak_phonemes = use_espeak_phonemes\n        self.keep_stress = keep_stress\n\n    @staticmethod\n    def name():\n        return \"gruut\"\n\n    def phonemize_gruut(self, text: str, separator: str = \"|\", tie=False) -> str:  # pylint: disable=unused-argument\n        \"\"\"Convert input text to phonemes.\n\n        Gruut phonemizes the given `str` by separating each phoneme character with `separator`, even for characters\n        that constitute a single sound.\n\n        It doesn't affect 🐸TTS since it individually converts each character to token IDs.\n\n        Examples::\n            \"hello how are you today?\" -> `h|ɛ|l|o|ʊ| h|a|ʊ| ɑ|ɹ| j|u| t|ə|d|e|ɪ`\n\n        Args:\n            text (str):\n                Text to be converted to phonemes.\n\n            tie (bool, optional) : When True use a '͡' character between\n                consecutive characters of a single phoneme. Else separate phoneme\n                with '_'. This option requires espeak>=1.49. 
Default to False.\n        \"\"\"\n        ph_list = []\n        for sentence in gruut.sentences(text, lang=self.language, espeak=self.use_espeak_phonemes):\n            for word in sentence:\n                if word.is_break:\n                    # Use actual character for break phoneme (e.g., comma)\n                    if ph_list:\n                        # Join with previous word\n                        ph_list[-1].append(word.text)\n                    else:\n                        # First word is punctuation\n                        ph_list.append([word.text])\n                elif word.phonemes:\n                    # Add phonemes for word\n                    word_phonemes = []\n\n                    for word_phoneme in word.phonemes:\n                        if not self.keep_stress:\n                            # Remove primary/secondary stress\n                            word_phoneme = IPA.without_stress(word_phoneme)\n\n                        word_phoneme = word_phoneme.translate(GRUUT_TRANS_TABLE)\n\n                        if word_phoneme:\n                            # Flatten phonemes\n                            word_phonemes.extend(word_phoneme)\n\n                    if word_phonemes:\n                        ph_list.append(word_phonemes)\n\n        ph_words = [separator.join(word_phonemes) for word_phonemes in ph_list]\n        ph = f\"{separator} \".join(ph_words)\n        return ph\n\n    def _phonemize(self, text, separator):\n        return self.phonemize_gruut(text, separator, tie=False)\n\n    def is_supported_language(self, language):\n        \"\"\"Returns True if `language` is supported by the backend\"\"\"\n        return gruut.is_language_supported(language)\n\n    @staticmethod\n    def supported_languages() -> List:\n        \"\"\"Get a dictionary of supported languages.\n\n        Returns:\n            List: List of language codes.\n        \"\"\"\n        return list(gruut.get_supported_languages())\n\n    def version(self):\n  
      \"\"\"Get the version of the used backend.\n\n        Returns:\n            str: Version of the used backend.\n        \"\"\"\n        return gruut.__version__\n\n    @classmethod\n    def is_available(cls):\n        \"\"\"Return True if Gruut is available, else False\"\"\"\n        return importlib.util.find_spec(\"gruut\") is not None\n\n\nif __name__ == \"__main__\":\n    e = Gruut(language=\"en-us\")\n    print(e.supported_languages())\n    print(e.version())\n    print(e.language)\n    print(e.name())\n    print(e.is_available())\n\n    e = Gruut(language=\"en-us\", keep_puncs=False)\n    print(\"`\" + e.phonemize(\"hello how are you today?\") + \"`\")\n\n    e = Gruut(language=\"en-us\", keep_puncs=True)\n    print(\"`\" + e.phonemize(\"hello how, are you today?\") + \"`\")\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/ja_jp_phonemizer.py",
    "content": "from typing import Dict\n\nfrom TTS.tts.utils.text.japanese.phonemizer import japanese_text_to_phonemes\nfrom TTS.tts.utils.text.phonemizers.base import BasePhonemizer\n\n_DEF_JA_PUNCS = \"、.,[]()?!〽~『』「」【】\"\n\n_TRANS_TABLE = {\"、\": \",\"}\n\n\ndef trans(text):\n    for i, j in _TRANS_TABLE.items():\n        text = text.replace(i, j)\n    return text\n\n\nclass JA_JP_Phonemizer(BasePhonemizer):\n    \"\"\"🐸TTS Ja-Jp phonemizer using functions in `TTS.tts.utils.text.japanese.phonemizer`\n\n    TODO: someone with JA knowledge should check this implementation\n\n    Example:\n\n        >>> from TTS.tts.utils.text.phonemizers import JA_JP_Phonemizer\n        >>> phonemizer = JA_JP_Phonemizer()\n        >>> phonemizer.phonemize(\"どちらに行きますか？\", separator=\"|\")\n        'd|o|c|h|i|r|a|n|i|i|k|i|m|a|s|u|k|a|?'\n\n    \"\"\"\n\n    language = \"ja-jp\"\n\n    def __init__(self, punctuations=_DEF_JA_PUNCS, keep_puncs=True, **kwargs):  # pylint: disable=unused-argument\n        super().__init__(self.language, punctuations=punctuations, keep_puncs=keep_puncs)\n\n    @staticmethod\n    def name():\n        return \"ja_jp_phonemizer\"\n\n    def _phonemize(self, text: str, separator: str = \"|\") -> str:\n        ph = japanese_text_to_phonemes(text)\n        # Join the phonemes only when a non-empty separator is given\n        if separator is not None and separator != \"\":\n            return separator.join(ph)\n        return ph\n\n    def phonemize(self, text: str, separator=\"|\", language=None) -> str:\n        \"\"\"Custom phonemize for JA_JP\n\n        Skip pre-post processing steps used by the other phonemizers.\n        \"\"\"\n        return self._phonemize(text, separator)\n\n    @staticmethod\n    def supported_languages() -> Dict:\n        return {\"ja-jp\": \"Japanese (Japan)\"}\n\n    def version(self) -> str:\n        return \"0.0.1\"\n\n    def is_available(self) -> bool:\n        return True\n\n\n# if __name__ == \"__main__\":\n#     text = \"これは、電話をかけるための私の日本語の例のテキストです。\"\n#     e = JA_JP_Phonemizer()\n#     
print(e.supported_languages())\n#     print(e.version())\n#     print(e.language)\n#     print(e.name())\n#     print(e.is_available())\n#     print(\"`\" + e.phonemize(text) + \"`\")\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/ko_kr_phonemizer.py",
    "content": "from typing import Dict\n\nfrom TTS.tts.utils.text.korean.phonemizer import korean_text_to_phonemes\nfrom TTS.tts.utils.text.phonemizers.base import BasePhonemizer\n\n_DEF_KO_PUNCS = \"、.,[]()?!〽~『』「」【】\"\n\n\nclass KO_KR_Phonemizer(BasePhonemizer):\n    \"\"\"🐸TTS ko_kr_phonemizer using functions in `TTS.tts.utils.text.korean.phonemizer`\n\n    TODO: Add Korean to character (ᄀᄁᄂᄃᄄᄅᄆᄇᄈᄉᄊᄋᄌᄍᄎᄏᄐᄑ하ᅢᅣᅤᅥᅦᅧᅨᅩᅪᅫᅬᅭᅮᅯᅰᅱᅲᅳᅴᅵᆨᆩᆪᆫᆬᆭᆮᆯᆰᆱᆲᆳᆴᆵᆶᆷᆸᆹᆺᆻᆼᆽᆾᆿᇀᇁᇂ)\n\n    Example:\n\n        >>> from TTS.tts.utils.text.phonemizers import KO_KR_Phonemizer\n        >>> phonemizer = KO_KR_Phonemizer()\n        >>> phonemizer.phonemize(\"이 문장은 음성합성 테스트를 위한 문장입니다.\", separator=\"|\")\n        'ᄋ|ᅵ| |ᄆ|ᅮ|ᆫ|ᄌ|ᅡ|ᆼ|ᄋ|ᅳ| |ᄂ|ᅳ|ᆷ|ᄉ|ᅥ|ᆼ|ᄒ|ᅡ|ᆸ|ᄊ|ᅥ|ᆼ| |ᄐ|ᅦ|ᄉ|ᅳ|ᄐ|ᅳ|ᄅ|ᅳ| |ᄅ|ᅱ|ᄒ|ᅡ|ᆫ| |ᄆ|ᅮ|ᆫ|ᄌ|ᅡ|ᆼ|ᄋ|ᅵ|ᆷ|ᄂ|ᅵ|ᄃ|ᅡ|.'\n\n        >>> from TTS.tts.utils.text.phonemizers import KO_KR_Phonemizer\n        >>> phonemizer = KO_KR_Phonemizer()\n        >>> phonemizer.phonemize(\"이 문장은 음성합성 테스트를 위한 문장입니다.\", separator=\"|\", character='english')\n        'I| |M|u|n|J|a|n|g|E|u| |N|e|u|m|S|e|o|n|g|H|a|b|S|s|e|o|n|g| |T|e|S|e|u|T|e|u|L|e|u| |L|w|i|H|a|n| |M|u|n|J|a|n|g|I|m|N|i|D|a|.'\n\n    \"\"\"\n\n    language = \"ko-kr\"\n\n    def __init__(self, punctuations=_DEF_KO_PUNCS, keep_puncs=True, **kwargs):  # pylint: disable=unused-argument\n        super().__init__(self.language, punctuations=punctuations, keep_puncs=keep_puncs)\n\n    @staticmethod\n    def name():\n        return \"ko_kr_phonemizer\"\n\n    def _phonemize(self, text: str, separator: str = \"\", character: str = \"hangeul\") -> str:\n        ph = korean_text_to_phonemes(text, character=character)\n        # Join the phonemes only when a non-empty separator is given\n        if separator is not None and separator != \"\":\n            return separator.join(ph)\n        return ph\n\n    def phonemize(self, text: str, separator: str = \"\", character: str = \"hangeul\", language=None) -> str:\n        return self._phonemize(text, separator, character)\n\n    @staticmethod\n    def supported_languages() -> 
Dict:\n        return {\"ko-kr\": \"hangeul(korean)\"}\n\n    def version(self) -> str:\n        return \"0.0.2\"\n\n    def is_available(self) -> bool:\n        return True\n\n\nif __name__ == \"__main__\":\n    texts = \"이 문장은 음성합성 테스트를 위한 문장입니다.\"\n    e = KO_KR_Phonemizer()\n    print(e.supported_languages())\n    print(e.version())\n    print(e.language)\n    print(e.name())\n    print(e.is_available())\n    print(e.phonemize(texts))\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/multi_phonemizer.py",
    "content": "from typing import Dict, List\n\nfrom TTS.tts.utils.text.phonemizers import DEF_LANG_TO_PHONEMIZER, get_phonemizer_by_name\n\n\nclass MultiPhonemizer:\n    \"\"\"🐸TTS multi-phonemizer that operates phonemizers for multiple langugages\n\n    Args:\n        custom_lang_to_phonemizer (Dict):\n            Custom phonemizer mapping if you want to change the defaults. In the format of\n            `{\"lang_code\", \"phonemizer_name\"}`. When it is None, `DEF_LANG_TO_PHONEMIZER` is used. Defaults to `{}`.\n\n    TODO: find a way to pass custom kwargs to the phonemizers\n    \"\"\"\n\n    lang_to_phonemizer = {}\n\n    def __init__(self, lang_to_phonemizer_name: Dict = {}) -> None:  # pylint: disable=dangerous-default-value\n        for k, v in lang_to_phonemizer_name.items():\n            if v == \"\" and k in DEF_LANG_TO_PHONEMIZER.keys():\n                lang_to_phonemizer_name[k] = DEF_LANG_TO_PHONEMIZER[k]\n            elif v == \"\":\n                raise ValueError(f\"Phonemizer wasn't set for language {k} and doesn't have a default.\")\n        self.lang_to_phonemizer_name = lang_to_phonemizer_name\n        self.lang_to_phonemizer = self.init_phonemizers(self.lang_to_phonemizer_name)\n\n    @staticmethod\n    def init_phonemizers(lang_to_phonemizer_name: Dict) -> Dict:\n        lang_to_phonemizer = {}\n        for k, v in lang_to_phonemizer_name.items():\n            lang_to_phonemizer[k] = get_phonemizer_by_name(v, language=k)\n        return lang_to_phonemizer\n\n    @staticmethod\n    def name():\n        return \"multi-phonemizer\"\n\n    def phonemize(self, text, separator=\"|\", language=\"\"):\n        if language == \"\":\n            raise ValueError(\"Language must be set for multi-phonemizer to phonemize.\")\n        return self.lang_to_phonemizer[language].phonemize(text, separator)\n\n    def supported_languages(self) -> List:\n        return list(self.lang_to_phonemizer.keys())\n\n    def print_logs(self, level: int = 0):\n        
indent = \"\\t\" * level\n        print(f\"{indent}| > phoneme language: {self.supported_languages()}\")\n        print(f\"{indent}| > phoneme backend: {self.name()}\")\n\n\n# if __name__ == \"__main__\":\n#     texts = {\n#         \"tr\": \"Merhaba, bu Türkçe bir örnek!\",\n#         \"en-us\": \"Hello, this is an English example!\",\n#         \"de\": \"Hallo, das ist ein deutsches Beispiel!\",\n#         \"zh-cn\": \"这是中国的例子\",\n#     }\n#     phonemes = {}\n#     ph = MultiPhonemizer({\"tr\": \"espeak\", \"en-us\": \"\", \"de\": \"gruut\", \"zh-cn\": \"\"})\n#     for lang, text in texts.items():\n#         # `language` must be passed by keyword; the second positional argument is `separator`\n#         phoneme = ph.phonemize(text, language=lang)\n#         phonemes[lang] = phoneme\n#     print(phonemes)\n"
  },
  {
    "path": "TTS/tts/utils/text/phonemizers/zh_cn_phonemizer.py",
    "content": "from typing import Dict\n\nfrom TTS.tts.utils.text.chinese_mandarin.phonemizer import chinese_text_to_phonemes\nfrom TTS.tts.utils.text.phonemizers.base import BasePhonemizer\n\n_DEF_ZH_PUNCS = \"、.,[]()?!〽~『』「」【】\"\n\n\nclass ZH_CN_Phonemizer(BasePhonemizer):\n    \"\"\"🐸TTS Zh-Cn phonemizer using functions in `TTS.tts.utils.text.chinese_mandarin.phonemizer`\n\n    Args:\n        punctuations (str):\n            Set of characters to be treated as punctuation. Defaults to `_DEF_ZH_PUNCS`.\n\n        keep_puncs (bool):\n            If True, keep the punctuations after phonemization. Defaults to False.\n\n    Example ::\n\n        \"这是，样本中文。\" -> `d|ʒ|ø|4| |ʂ|ʏ|4| |，| |i|ɑ|ŋ|4|b|œ|n|3| |d|ʒ|o|ŋ|1|w|œ|n|2| |。`\n\n    TODO: someone with Mandarin knowledge should check this implementation\n    \"\"\"\n\n    language = \"zh-cn\"\n\n    def __init__(self, punctuations=_DEF_ZH_PUNCS, keep_puncs=False, **kwargs):  # pylint: disable=unused-argument\n        super().__init__(self.language, punctuations=punctuations, keep_puncs=keep_puncs)\n\n    @staticmethod\n    def name():\n        return \"zh_cn_phonemizer\"\n\n    @staticmethod\n    def phonemize_zh_cn(text: str, separator: str = \"|\") -> str:\n        ph = chinese_text_to_phonemes(text, separator)\n        return ph\n\n    def _phonemize(self, text, separator):\n        return self.phonemize_zh_cn(text, separator)\n\n    @staticmethod\n    def supported_languages() -> Dict:\n        return {\"zh-cn\": \"Chinese (China)\"}\n\n    def version(self) -> str:\n        return \"0.0.1\"\n\n    def is_available(self) -> bool:\n        return True\n\n\n# if __name__ == \"__main__\":\n#     text = \"这是，样本中文。\"\n#     e = ZH_CN_Phonemizer()\n#     print(e.supported_languages())\n#     print(e.version())\n#     print(e.language)\n#     print(e.name())\n#     print(e.is_available())\n#     print(\"`\" + e.phonemize(text) + \"`\")\n"
  },
  {
    "path": "TTS/tts/utils/text/punctuation.py",
    "content": "import collections\nimport re\nfrom enum import Enum\n\nimport six\n\n_DEF_PUNCS = ';:,.!?¡¿—…\"«»“”'\n\n_PUNC_IDX = collections.namedtuple(\"_punc_index\", [\"punc\", \"position\"])\n\n\nclass PuncPosition(Enum):\n    \"\"\"Enum for the punctuations positions\"\"\"\n\n    BEGIN = 0\n    END = 1\n    MIDDLE = 2\n    ALONE = 3\n\n\nclass Punctuation:\n    \"\"\"Handle punctuations in text.\n\n    Just strip punctuations from text or strip and restore them later.\n\n    Args:\n        puncs (str): The punctuations to be processed. Defaults to `_DEF_PUNCS`.\n\n    Example:\n        >>> punc = Punctuation()\n        >>> punc.strip(\"This is. example !\")\n        'This is example'\n\n        >>> text_striped, punc_map = punc.strip_to_restore(\"This is. example !\")\n        >>> ' '.join(text_striped)\n        'This is example'\n\n        >>> text_restored = punc.restore(text_striped, punc_map)\n        >>> text_restored[0]\n        'This is. example !'\n    \"\"\"\n\n    def __init__(self, puncs: str = _DEF_PUNCS):\n        self.puncs = puncs\n\n    @staticmethod\n    def default_puncs():\n        \"\"\"Return default set of punctuations.\"\"\"\n        return _DEF_PUNCS\n\n    @property\n    def puncs(self):\n        return self._puncs\n\n    @puncs.setter\n    def puncs(self, value):\n        if not isinstance(value, six.string_types):\n            raise ValueError(\"[!] Punctuations must be of type str.\")\n        self._puncs = \"\".join(list(dict.fromkeys(list(value))))  # remove duplicates without changing the oreder\n        self.puncs_regular_exp = re.compile(rf\"(\\s*[{re.escape(self._puncs)}]+\\s*)+\")\n\n    def strip(self, text):\n        \"\"\"Remove all the punctuations by replacing with `space`.\n\n        Args:\n            text (str): The text to be processed.\n\n        Example::\n\n            \"This is. 
example !\" -> \"This is example\"\n        \"\"\"\n        return re.sub(self.puncs_regular_exp, \" \", text).strip()\n\n    def strip_to_restore(self, text):\n        \"\"\"Remove punctuations from text to restore them later.\n\n        Args:\n            text (str): The text to be processed.\n\n        Examples ::\n\n            \"This is. example !\" -> [[\"This is\", \"example\"], [\". \", \" !\"]]\n\n        \"\"\"\n        text, puncs = self._strip_to_restore(text)\n        return text, puncs\n\n    def _strip_to_restore(self, text):\n        \"\"\"Auxiliary method for Punctuation.strip_to_restore()\"\"\"\n        matches = list(re.finditer(self.puncs_regular_exp, text))\n        if not matches:\n            return [text], []\n        # the text is only punctuations\n        if len(matches) == 1 and matches[0].group() == text:\n            return [], [_PUNC_IDX(text, PuncPosition.ALONE)]\n        # build a punctuation map to be used later to restore punctuations\n        puncs = []\n        for match in matches:\n            position = PuncPosition.MIDDLE\n            if match == matches[0] and text.startswith(match.group()):\n                position = PuncPosition.BEGIN\n            elif match == matches[-1] and text.endswith(match.group()):\n                position = PuncPosition.END\n            puncs.append(_PUNC_IDX(match.group(), position))\n        # convert str text to a List[str], each item is separated by a punctuation\n        splitted_text = []\n        for idx, punc in enumerate(puncs):\n            split = text.split(punc.punc)\n            prefix, suffix = split[0], punc.punc.join(split[1:])\n            splitted_text.append(prefix)\n            # if the text does not end with a punctuation, add the remaining text as the last item\n            if idx == len(puncs) - 1 and len(suffix) > 0:\n                splitted_text.append(suffix)\n            text = suffix\n        return splitted_text, puncs\n\n    @classmethod\n    def restore(cls, text, puncs):\n    
    \"\"\"Restore punctuation in a text.\n\n        Args:\n            text (List[str]): The text segments returned by `strip_to_restore`.\n            puncs (List[_PUNC_IDX]): The punctuation map returned by `strip_to_restore`.\n\n        Examples ::\n\n            ['This is', 'example'], ['. ', ' !'] -> [\"This is. example !\"]\n\n        \"\"\"\n        return cls._restore(text, puncs, 0)\n\n    @classmethod\n    def _restore(cls, text, puncs, num):  # pylint: disable=too-many-return-statements\n        \"\"\"Auxiliary method for Punctuation.restore()\"\"\"\n        if not puncs:\n            return text\n\n        # nothing has been phonemized, return the puncs alone\n        if not text:\n            return [\"\".join(m.punc for m in puncs)]\n\n        current = puncs[0]\n\n        if current.position == PuncPosition.BEGIN:\n            return cls._restore([current.punc + text[0]] + text[1:], puncs[1:], num)\n\n        if current.position == PuncPosition.END:\n            return [text[0] + current.punc] + cls._restore(text[1:], puncs[1:], num + 1)\n\n        if current.position == PuncPosition.ALONE:\n            # `_PUNC_IDX` only defines the fields `punc` and `position`\n            return [current.punc] + cls._restore(text, puncs[1:], num + 1)\n\n        # POSITION == MIDDLE\n        if len(text) == 1:  # pragma: nocover\n            # a corner case where the final part of an intermediate\n            # mark (I) has not been phonemized\n            return cls._restore([text[0] + current.punc], puncs[1:], num)\n\n        return cls._restore([text[0] + current.punc + text[1]] + text[2:], puncs[1:], num)\n\n\n# if __name__ == \"__main__\":\n#     punc = Punctuation()\n#     text = \"This is. This is, example!\"\n\n#     print(punc.strip(text))\n\n#     split_text, puncs = punc.strip_to_restore(text)\n#     print(split_text, \" ---- \", puncs)\n\n#     restored_text = punc.restore(split_text, puncs)\n#     print(restored_text)\n"
  },
  {
    "path": "TTS/tts/utils/text/tokenizer.py",
    "content": "from typing import Callable, Dict, List, Union\n\nfrom TTS.tts.utils.text import cleaners\nfrom TTS.tts.utils.text.characters import Graphemes, IPAPhonemes\nfrom TTS.tts.utils.text.phonemizers import DEF_LANG_TO_PHONEMIZER, get_phonemizer_by_name\nfrom TTS.tts.utils.text.phonemizers.multi_phonemizer import MultiPhonemizer\nfrom TTS.utils.generic_utils import get_import_path, import_class\n\n\nclass TTSTokenizer:\n    \"\"\"🐸TTS tokenizer to convert input characters to token IDs and back.\n\n    Token IDs for OOV chars are discarded but those are stored in `self.not_found_characters` for later.\n\n    Args:\n        use_phonemes (bool):\n            Whether to use phonemes instead of characters. Defaults to False.\n\n        characters (Characters):\n            A Characters object to use for character-to-ID and ID-to-character mappings.\n\n        text_cleaner (callable):\n            A function to pre-process the text before tokenization and phonemization. Defaults to None.\n\n        phonemizer (Phonemizer):\n            A phonemizer object or a dict that maps language codes to phonemizer objects. 
Defaults to None.\n\n    Example:\n\n        >>> from TTS.tts.utils.text.tokenizer import TTSTokenizer\n        >>> tokenizer = TTSTokenizer(use_phonemes=False, characters=Graphemes())\n        >>> text = \"Hello world!\"\n        >>> ids = tokenizer.text_to_ids(text)\n        >>> text_hat = tokenizer.ids_to_text(ids)\n        >>> assert text == text_hat\n    \"\"\"\n\n    def __init__(\n        self,\n        use_phonemes=False,\n        text_cleaner: Callable = None,\n        characters: \"BaseCharacters\" = None,\n        phonemizer: Union[\"Phonemizer\", Dict] = None,\n        add_blank: bool = False,\n        use_eos_bos=False,\n    ):\n        self.text_cleaner = text_cleaner\n        self.use_phonemes = use_phonemes\n        self.add_blank = add_blank\n        self.use_eos_bos = use_eos_bos\n        self.characters = characters\n        self.not_found_characters = []\n        self.phonemizer = phonemizer\n\n    @property\n    def characters(self):\n        return self._characters\n\n    @characters.setter\n    def characters(self, new_characters):\n        self._characters = new_characters\n        self.pad_id = self.characters.char_to_id(self.characters.pad) if self.characters.pad else None\n        self.blank_id = self.characters.char_to_id(self.characters.blank) if self.characters.blank else None\n\n    def encode(self, text: str) -> List[int]:\n        \"\"\"Encodes a string of text as a sequence of IDs.\"\"\"\n        token_ids = []\n        for char in text:\n            try:\n                idx = self.characters.char_to_id(char)\n                token_ids.append(idx)\n            except KeyError:\n                # discard but store not found characters\n                if char not in self.not_found_characters:\n                    self.not_found_characters.append(char)\n                    print(text)\n                    print(f\" [!] Character {repr(char)} not found in the vocabulary. 
Discarding it.\")\n        return token_ids\n\n    def decode(self, token_ids: List[int]) -> str:\n        \"\"\"Decodes a sequence of IDs to a string of text.\"\"\"\n        text = \"\"\n        for token_id in token_ids:\n            text += self.characters.id_to_char(token_id)\n        return text\n\n    def text_to_ids(self, text: str, language: str = None) -> List[int]:  # pylint: disable=unused-argument\n        \"\"\"Converts a string of text to a sequence of token IDs.\n\n        Args:\n            text(str):\n                The text to convert to token IDs.\n\n            language(str):\n                The language code of the text. Defaults to None.\n\n        TODO:\n            - Add support for language-specific processing.\n\n        1. Text normalization (if a text cleaner is set)\n        2. Phonemization (if use_phonemes is True)\n        3. Add blank char between characters (if add_blank is True)\n        4. Add BOS and EOS characters (if use_eos_bos is True)\n        5. Text to token IDs\n        \"\"\"\n        # TODO: text cleaner should pick the right routine based on the language\n        if self.text_cleaner is not None:\n            text = self.text_cleaner(text)\n        if self.use_phonemes:\n            text = self.phonemizer.phonemize(text, separator=\"\", language=language)\n        if self.add_blank:\n            text = self.intersperse_blank_char(text, True)\n        if self.use_eos_bos:\n            text = self.pad_with_bos_eos(text)\n        return self.encode(text)\n\n    def ids_to_text(self, id_sequence: List[int]) -> str:\n        \"\"\"Converts a sequence of token IDs to a string of text.\"\"\"\n        return self.decode(id_sequence)\n\n    def pad_with_bos_eos(self, char_sequence: List[str]):\n        \"\"\"Pads a sequence with the special BOS and EOS characters.\"\"\"\n        return [self.characters.bos] + list(char_sequence) + [self.characters.eos]\n\n    def intersperse_blank_char(self, char_sequence: List[str], use_blank_char: bool = False):\n        \"\"\"Intersperses the blank character 
between characters in a sequence.\n\n        Use the ```blank``` character if defined else use the ```pad``` character.\n        \"\"\"\n        char_to_use = self.characters.blank if use_blank_char else self.characters.pad\n        result = [char_to_use] * (len(char_sequence) * 2 + 1)\n        result[1::2] = char_sequence\n        return result\n\n    def print_logs(self, level: int = 0):\n        indent = \"\\t\" * level\n        print(f\"{indent}| > add_blank: {self.add_blank}\")\n        print(f\"{indent}| > use_eos_bos: {self.use_eos_bos}\")\n        print(f\"{indent}| > use_phonemes: {self.use_phonemes}\")\n        if self.use_phonemes:\n            print(f\"{indent}| > phonemizer:\")\n            self.phonemizer.print_logs(level + 1)\n        if len(self.not_found_characters) > 0:\n            print(f\"{indent}| > {len(self.not_found_characters)} not found characters:\")\n            for char in self.not_found_characters:\n                print(f\"{indent}| > {char}\")\n\n    @staticmethod\n    def init_from_config(config: \"Coqpit\", characters: \"BaseCharacters\" = None):\n        \"\"\"Init Tokenizer object from config\n\n        Args:\n            config (Coqpit): Coqpit model config.\n            characters (BaseCharacters): Defines the model character set. If not set, use the default options based on\n                the config values. 
Defaults to None.\n        \"\"\"\n        # init cleaners\n        text_cleaner = None\n        if isinstance(config.text_cleaner, (str, list)):\n            text_cleaner = getattr(cleaners, config.text_cleaner)\n\n        # init characters\n        if characters is None:\n            # set characters based on defined characters class\n            if config.characters and config.characters.characters_class:\n                CharactersClass = import_class(config.characters.characters_class)\n                characters, new_config = CharactersClass.init_from_config(config)\n            # set characters based on config\n            else:\n                if config.use_phonemes:\n                    # init phoneme set\n                    characters, new_config = IPAPhonemes().init_from_config(config)\n                else:\n                    # init character set\n                    characters, new_config = Graphemes().init_from_config(config)\n\n        else:\n            characters, new_config = characters.init_from_config(config)\n\n        # set characters class\n        new_config.characters.characters_class = get_import_path(characters)\n\n        # init phonemizer\n        phonemizer = None\n        if config.use_phonemes:\n            if \"phonemizer\" in config and config.phonemizer == \"multi_phonemizer\":\n                lang_to_phonemizer_name = {}\n                for dataset in config.datasets:\n                    if dataset.language != \"\":\n                        lang_to_phonemizer_name[dataset.language] = dataset.phonemizer\n                    else:\n                        raise ValueError(\"Multi phonemizer requires language to be set for each dataset.\")\n                phonemizer = MultiPhonemizer(lang_to_phonemizer_name)\n            else:\n                phonemizer_kwargs = {\"language\": config.phoneme_language}\n                if \"phonemizer\" in config and config.phonemizer:\n                    phonemizer = 
get_phonemizer_by_name(config.phonemizer, **phonemizer_kwargs)\n                else:\n                    try:\n                        phonemizer = get_phonemizer_by_name(\n                            DEF_LANG_TO_PHONEMIZER[config.phoneme_language], **phonemizer_kwargs\n                        )\n                        new_config.phonemizer = phonemizer.name()\n                    except KeyError as e:\n                        raise ValueError(\n                            f\"\"\"No phonemizer found for language {config.phoneme_language}.\n                            You may need to install a third party library for this language.\"\"\"\n                        ) from e\n\n        return (\n            TTSTokenizer(\n                config.use_phonemes, text_cleaner, characters, phonemizer, config.add_blank, config.enable_eos_bos_chars\n            ),\n            new_config,\n        )\n"
  },
  {
    "path": "TTS/tts/utils/visual.py",
    "content": "import librosa\nimport matplotlib\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport torch\nfrom matplotlib.colors import LogNorm\n\nmatplotlib.use(\"Agg\")\n\n\ndef plot_alignment(alignment, info=None, fig_size=(16, 10), title=None, output_fig=False, plot_log=False):\n    if isinstance(alignment, torch.Tensor):\n        alignment_ = alignment.detach().cpu().numpy().squeeze()\n    else:\n        alignment_ = alignment\n    alignment_ = alignment_.astype(np.float32) if alignment_.dtype == np.float16 else alignment_\n    fig, ax = plt.subplots(figsize=fig_size)\n    im = ax.imshow(\n        alignment_.T, aspect=\"auto\", origin=\"lower\", interpolation=\"none\", norm=LogNorm() if plot_log else None\n    )\n    fig.colorbar(im, ax=ax)\n    xlabel = \"Decoder timestep\"\n    if info is not None:\n        xlabel += \"\\n\\n\" + info\n    plt.xlabel(xlabel)\n    plt.ylabel(\"Encoder timestep\")\n    # plt.yticks(range(len(text)), list(text))\n    plt.tight_layout()\n    if title is not None:\n        plt.title(title)\n    if not output_fig:\n        plt.close()\n    return fig\n\n\ndef plot_spectrogram(spectrogram, ap=None, fig_size=(16, 10), output_fig=False):\n    if isinstance(spectrogram, torch.Tensor):\n        spectrogram_ = spectrogram.detach().cpu().numpy().squeeze().T\n    else:\n        spectrogram_ = spectrogram.T\n    spectrogram_ = spectrogram_.astype(np.float32) if spectrogram_.dtype == np.float16 else spectrogram_\n    if ap is not None:\n        spectrogram_ = ap.denormalize(spectrogram_)  # pylint: disable=protected-access\n    fig = plt.figure(figsize=fig_size)\n    plt.imshow(spectrogram_, aspect=\"auto\", origin=\"lower\")\n    plt.colorbar()\n    plt.tight_layout()\n    if not output_fig:\n        plt.close()\n    return fig\n\n\ndef plot_pitch(pitch, spectrogram, ap=None, fig_size=(30, 10), output_fig=False):\n    \"\"\"Plot pitch curves on top of the spectrogram.\n\n    Args:\n        pitch (np.array): Pitch values.\n      
  spectrogram (np.array): Spectrogram values.\n\n    Shapes:\n        pitch: :math:`(T,)`\n        spec: :math:`(C, T)`\n    \"\"\"\n\n    if isinstance(spectrogram, torch.Tensor):\n        spectrogram_ = spectrogram.detach().cpu().numpy().squeeze().T\n    else:\n        spectrogram_ = spectrogram.T\n    spectrogram_ = spectrogram_.astype(np.float32) if spectrogram_.dtype == np.float16 else spectrogram_\n    if ap is not None:\n        spectrogram_ = ap.denormalize(spectrogram_)  # pylint: disable=protected-access\n\n    old_fig_size = plt.rcParams[\"figure.figsize\"]\n    if fig_size is not None:\n        plt.rcParams[\"figure.figsize\"] = fig_size\n\n    fig, ax = plt.subplots()\n\n    ax.imshow(spectrogram_, aspect=\"auto\", origin=\"lower\")\n    ax.set_xlabel(\"time\")\n    ax.set_ylabel(\"spec_freq\")\n\n    ax2 = ax.twinx()\n    ax2.plot(pitch, linewidth=5.0, color=\"red\")\n    ax2.set_ylabel(\"F0\")\n\n    plt.rcParams[\"figure.figsize\"] = old_fig_size\n    if not output_fig:\n        plt.close()\n    return fig\n\n\ndef plot_avg_pitch(pitch, chars, fig_size=(30, 10), output_fig=False):\n    \"\"\"Plot pitch curves on top of the input characters.\n\n    Args:\n        pitch (np.array): Pitch values.\n        chars (str): Characters to place to the x-axis.\n\n    Shapes:\n        pitch: :math:`(T,)`\n    \"\"\"\n    old_fig_size = plt.rcParams[\"figure.figsize\"]\n    if fig_size is not None:\n        plt.rcParams[\"figure.figsize\"] = fig_size\n\n    fig, ax = plt.subplots()\n\n    x = np.array(range(len(chars)))\n    my_xticks = chars\n    plt.xticks(x, my_xticks)\n\n    ax.set_xlabel(\"characters\")\n    ax.set_ylabel(\"freq\")\n\n    ax2 = ax.twinx()\n    ax2.plot(pitch, linewidth=5.0, color=\"red\")\n    ax2.set_ylabel(\"F0\")\n\n    plt.rcParams[\"figure.figsize\"] = old_fig_size\n    if not output_fig:\n        plt.close()\n    return fig\n\n\ndef plot_avg_energy(energy, chars, fig_size=(30, 10), output_fig=False):\n    \"\"\"Plot energy curves on 
top of the input characters.\n\n    Args:\n        energy (np.array): energy values.\n        chars (str): Characters to place to the x-axis.\n\n    Shapes:\n        energy: :math:`(T,)`\n    \"\"\"\n    old_fig_size = plt.rcParams[\"figure.figsize\"]\n    if fig_size is not None:\n        plt.rcParams[\"figure.figsize\"] = fig_size\n\n    fig, ax = plt.subplots()\n\n    x = np.array(range(len(chars)))\n    my_xticks = chars\n    plt.xticks(x, my_xticks)\n\n    ax.set_xlabel(\"characters\")\n    ax.set_ylabel(\"freq\")\n\n    ax2 = ax.twinx()\n    ax2.plot(energy, linewidth=5.0, color=\"red\")\n    ax2.set_ylabel(\"energy\")\n\n    plt.rcParams[\"figure.figsize\"] = old_fig_size\n    if not output_fig:\n        plt.close()\n    return fig\n\n\ndef visualize(\n    alignment,\n    postnet_output,\n    text,\n    hop_length,\n    CONFIG,\n    tokenizer,\n    stop_tokens=None,\n    decoder_output=None,\n    output_path=None,\n    figsize=(8, 24),\n    output_fig=False,\n):\n    \"\"\"Intended to be used in Notebooks.\"\"\"\n\n    if decoder_output is not None:\n        num_plot = 4\n    else:\n        num_plot = 3\n\n    label_fontsize = 16\n    fig = plt.figure(figsize=figsize)\n\n    plt.subplot(num_plot, 1, 1)\n    plt.imshow(alignment.T, aspect=\"auto\", origin=\"lower\", interpolation=None)\n    plt.xlabel(\"Decoder timestamp\", fontsize=label_fontsize)\n    plt.ylabel(\"Encoder timestamp\", fontsize=label_fontsize)\n    # compute phoneme representation and back\n    if CONFIG.use_phonemes:\n        seq = tokenizer.text_to_ids(text)\n        text = tokenizer.ids_to_text(seq)\n        print(text)\n    plt.yticks(range(len(text)), list(text))\n    plt.colorbar()\n\n    if stop_tokens is not None:\n        # plot stopnet predictions\n        plt.subplot(num_plot, 1, 2)\n        plt.plot(range(len(stop_tokens)), list(stop_tokens))\n\n    # plot postnet spectrogram\n    plt.subplot(num_plot, 1, 3)\n    librosa.display.specshow(\n        postnet_output.T,\n        
sr=CONFIG.audio[\"sample_rate\"],\n        hop_length=hop_length,\n        x_axis=\"time\",\n        y_axis=\"linear\",\n        fmin=CONFIG.audio[\"mel_fmin\"],\n        fmax=CONFIG.audio[\"mel_fmax\"],\n    )\n\n    plt.xlabel(\"Time\", fontsize=label_fontsize)\n    plt.ylabel(\"Hz\", fontsize=label_fontsize)\n    plt.tight_layout()\n    plt.colorbar()\n\n    if decoder_output is not None:\n        plt.subplot(num_plot, 1, 4)\n        librosa.display.specshow(\n            decoder_output.T,\n            sr=CONFIG.audio[\"sample_rate\"],\n            hop_length=hop_length,\n            x_axis=\"time\",\n            y_axis=\"linear\",\n            fmin=CONFIG.audio[\"mel_fmin\"],\n            fmax=CONFIG.audio[\"mel_fmax\"],\n        )\n        plt.xlabel(\"Time\", fontsize=label_fontsize)\n        plt.ylabel(\"Hz\", fontsize=label_fontsize)\n        plt.tight_layout()\n        plt.colorbar()\n\n    if output_path:\n        print(output_path)\n        fig.savefig(output_path)\n        plt.close()\n\n    if not output_fig:\n        plt.close()\n"
  },
  {
    "path": "TTS/utils/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/utils/audio/__init__.py",
    "content": "from TTS.utils.audio.processor import AudioProcessor\n"
  },
  {
    "path": "TTS/utils/audio/numpy_transforms.py",
    "content": "from typing import Tuple\n\nimport librosa\nimport numpy as np\nimport scipy\nimport soundfile as sf\nfrom librosa import magphase, pyin\n\n# For using kwargs\n# pylint: disable=unused-argument\n\n\ndef build_mel_basis(\n    *,\n    sample_rate: int = None,\n    fft_size: int = None,\n    num_mels: int = None,\n    mel_fmax: int = None,\n    mel_fmin: int = None,\n    **kwargs,\n) -> np.ndarray:\n    \"\"\"Build melspectrogram basis.\n\n    Returns:\n        np.ndarray: melspectrogram basis.\n    \"\"\"\n    if mel_fmax is not None:\n        assert mel_fmax <= sample_rate // 2\n        assert mel_fmax - mel_fmin > 0\n    return librosa.filters.mel(sr=sample_rate, n_fft=fft_size, n_mels=num_mels, fmin=mel_fmin, fmax=mel_fmax)\n\n\ndef millisec_to_length(\n    *, frame_length_ms: int = None, frame_shift_ms: int = None, sample_rate: int = None, **kwargs\n) -> Tuple[int, int]:\n    \"\"\"Compute hop and window length from milliseconds.\n\n    Returns:\n        Tuple[int, int]: hop length and window length for STFT.\n    \"\"\"\n    factor = frame_length_ms / frame_shift_ms\n    assert (factor).is_integer(), \" [!] frame_shift_ms should divide frame_length_ms\"\n    win_length = int(frame_length_ms / 1000.0 * sample_rate)\n    hop_length = int(win_length / float(factor))\n    return win_length, hop_length\n\n\ndef _log(x, base):\n    if base == 10:\n        return np.log10(x)\n    return np.log(x)\n\n\ndef _exp(x, base):\n    if base == 10:\n        return np.power(10, x)\n    return np.exp(x)\n\n\ndef amp_to_db(*, x: np.ndarray = None, gain: float = 1, base: int = 10, **kwargs) -> np.ndarray:\n    \"\"\"Convert amplitude values to decibels.\n\n    Args:\n        x (np.ndarray): Amplitude spectrogram.\n        gain (float): Gain factor. Defaults to 1.\n        base (int): Logarithm base. Defaults to 10.\n\n    Returns:\n        np.ndarray: Decibels spectrogram.\n    \"\"\"\n    assert (x < 0).sum() == 0, \" [!] 
Input values must be non-negative.\"\n    return gain * _log(np.maximum(1e-8, x), base)\n\n\n# pylint: disable=no-self-use\ndef db_to_amp(*, x: np.ndarray = None, gain: float = 1, base: int = 10, **kwargs) -> np.ndarray:\n    \"\"\"Convert decibels spectrogram to amplitude spectrogram.\n\n    Args:\n        x (np.ndarray): Decibels spectrogram.\n        gain (float): Gain factor. Defaults to 1.\n        base (int): Logarithm base. Defaults to 10.\n\n    Returns:\n        np.ndarray: Amplitude spectrogram.\n    \"\"\"\n    return _exp(x / gain, base)\n\n\ndef preemphasis(*, x: np.ndarray, coef: float = 0.97, **kwargs) -> np.ndarray:\n    \"\"\"Apply pre-emphasis to the audio signal. Useful to reduce the correlation between neighbouring signal values.\n\n    Args:\n        x (np.ndarray): Audio signal.\n\n    Raises:\n        RuntimeError: Preemphasis coeff is set to 0.\n\n    Returns:\n        np.ndarray: Decorrelated audio signal.\n    \"\"\"\n    if coef == 0:\n        raise RuntimeError(\" [!] Preemphasis is set 0.0.\")\n    return scipy.signal.lfilter([1, -coef], [1], x)\n\n\ndef deemphasis(*, x: np.ndarray = None, coef: float = 0.97, **kwargs) -> np.ndarray:\n    \"\"\"Reverse pre-emphasis.\"\"\"\n    if coef == 0:\n        raise RuntimeError(\" [!] Preemphasis is set 0.0.\")\n    return scipy.signal.lfilter([1], [1, -coef], x)\n\n\ndef spec_to_mel(*, spec: np.ndarray, mel_basis: np.ndarray = None, **kwargs) -> np.ndarray:\n    \"\"\"Convert a full scale linear spectrogram output of a network to a melspectrogram.\n\n    Args:\n        spec (np.ndarray): Normalized full scale linear spectrogram.\n\n    Shapes:\n        - spec: :math:`[C, T]`\n\n    Returns:\n        np.ndarray: Normalized melspectrogram.\n    \"\"\"\n    return np.dot(mel_basis, spec)\n\n\ndef mel_to_spec(*, mel: np.ndarray = None, mel_basis: np.ndarray = None, **kwargs) -> np.ndarray:\n    \"\"\"Convert a melspectrogram to full scale spectrogram.\"\"\"\n    assert (mel < 0).sum() == 0, \" [!] 
Input values must be non-negative.\"\n    inv_mel_basis = np.linalg.pinv(mel_basis)\n    return np.maximum(1e-10, np.dot(inv_mel_basis, mel))\n\n\ndef wav_to_spec(*, wav: np.ndarray = None, **kwargs) -> np.ndarray:\n    \"\"\"Compute a spectrogram from a waveform.\n\n    Args:\n        wav (np.ndarray): Waveform. Shape :math:`[T_wav,]`\n\n    Returns:\n        np.ndarray: Spectrogram. Shape :math:`[C, T_spec]`. :math:`T_spec == T_wav / hop_length`\n    \"\"\"\n    D = stft(y=wav, **kwargs)\n    S = np.abs(D)\n    return S.astype(np.float32)\n\n\ndef wav_to_mel(*, wav: np.ndarray = None, mel_basis=None, **kwargs) -> np.ndarray:\n    \"\"\"Compute a melspectrogram from a waveform.\"\"\"\n    D = stft(y=wav, **kwargs)\n    S = spec_to_mel(spec=np.abs(D), mel_basis=mel_basis, **kwargs)\n    return S.astype(np.float32)\n\n\ndef spec_to_wav(*, spec: np.ndarray, power: float = 1.5, **kwargs) -> np.ndarray:\n    \"\"\"Convert a spectrogram to a waveform using the Griffin-Lim vocoder.\"\"\"\n    S = spec.copy()\n    return griffin_lim(spec=S**power, **kwargs)\n\n\ndef mel_to_wav(*, mel: np.ndarray = None, power: float = 1.5, **kwargs) -> np.ndarray:\n    \"\"\"Convert a melspectrogram to a waveform using the Griffin-Lim vocoder.\"\"\"\n    S = mel.copy()\n    S = mel_to_spec(mel=S, mel_basis=kwargs[\"mel_basis\"])  # Convert back to linear\n    return griffin_lim(spec=S**power, **kwargs)\n\n\n### STFT and ISTFT ###\ndef stft(\n    *,\n    y: np.ndarray = None,\n    fft_size: int = None,\n    hop_length: int = None,\n    win_length: int = None,\n    pad_mode: str = \"reflect\",\n    window: str = \"hann\",\n    center: bool = True,\n    **kwargs,\n) -> np.ndarray:\n    \"\"\"Librosa STFT wrapper.\n\n    Check http://librosa.org/doc/main/generated/librosa.stft.html for argument details.\n\n    Returns:\n        np.ndarray: Complex number array.\n    \"\"\"\n    return librosa.stft(\n        y=y,\n        n_fft=fft_size,\n        hop_length=hop_length,\n        win_length=win_length,\n      
  pad_mode=pad_mode,\n        window=window,\n        center=center,\n    )\n\n\ndef istft(\n    *,\n    y: np.ndarray = None,\n    fft_size: int = None,\n    hop_length: int = None,\n    win_length: int = None,\n    window: str = \"hann\",\n    center: bool = True,\n    **kwargs,\n) -> np.ndarray:\n    \"\"\"Librosa iSTFT wrapper.\n\n    Check http://librosa.org/doc/main/generated/librosa.istft.html for argument details.\n\n    Returns:\n        np.ndarray: Real-valued waveform reconstructed from the complex STFT matrix `y`.\n    \"\"\"\n    return librosa.istft(y, hop_length=hop_length, win_length=win_length, center=center, window=window)\n\n\ndef griffin_lim(*, spec: np.ndarray = None, num_iter=60, **kwargs) -> np.ndarray:\n    \"\"\"Estimate a waveform from a magnitude spectrogram with the Griffin-Lim algorithm.\"\"\"\n    # start from random phases\n    angles = np.exp(2j * np.pi * np.random.rand(*spec.shape))\n    # the `np.complex` alias was removed in NumPy 1.24; the builtin `complex` is equivalent\n    S_complex = np.abs(spec).astype(complex)\n    y = istft(y=S_complex * angles, **kwargs)\n    if not np.isfinite(y).all():\n        print(\" [!] Waveform is not finite everywhere. Skipping the GL.\")\n        return np.array([0.0])\n    for _ in range(num_iter):\n        # keep the target magnitudes and update only the phase estimate\n        angles = np.exp(1j * np.angle(stft(y=y, **kwargs)))\n        y = istft(y=S_complex * angles, **kwargs)\n    return y\n\n\ndef compute_stft_paddings(\n    *, x: np.ndarray = None, hop_length: int = None, pad_two_sides: bool = False, **kwargs\n) -> Tuple[int, int]:\n    \"\"\"Compute paddings used by Librosa's STFT. 
Compute right padding (final frame) or both sides padding\n    (first and final frames)\"\"\"\n    pad = (x.shape[0] // hop_length + 1) * hop_length - x.shape[0]\n    if not pad_two_sides:\n        return 0, pad\n    return pad // 2, pad // 2 + pad % 2\n\n\ndef compute_f0(\n    *,\n    x: np.ndarray = None,\n    pitch_fmax: float = None,\n    pitch_fmin: float = None,\n    hop_length: int = None,\n    win_length: int = None,\n    sample_rate: int = None,\n    stft_pad_mode: str = \"reflect\",\n    center: bool = True,\n    **kwargs,\n) -> np.ndarray:\n    \"\"\"Compute pitch (f0) of a waveform using the same parameters used for computing melspectrogram.\n\n    Args:\n        x (np.ndarray): Waveform. Shape :math:`[T_wav,]`\n        pitch_fmax (float): Pitch max value.\n        pitch_fmin (float): Pitch min value.\n        hop_length (int): Number of audio samples between STFT columns.\n        win_length (int): STFT window length.\n        sample_rate (int): Audio sampling rate.\n        stft_pad_mode (str): Padding mode for STFT.\n        center (bool): Centered padding.\n\n    Returns:\n        np.ndarray: Pitch. Shape :math:`[T_pitch,]`. :math:`T_pitch == T_wav / hop_length`\n\n    Examples:\n        >>> WAV_FILE = filename = librosa.util.example_audio_file()\n        >>> from TTS.config import BaseAudioConfig\n        >>> from TTS.utils.audio import AudioProcessor\n        >>> conf = BaseAudioConfig(pitch_fmax=640, pitch_fmin=1)\n        >>> ap = AudioProcessor(**conf)\n        >>> wav = ap.load_wav(WAV_FILE, sr=ap.sample_rate)[:5 * ap.sample_rate]\n        >>> pitch = ap.compute_f0(wav)\n    \"\"\"\n    assert pitch_fmax is not None, \" [!] Set `pitch_fmax` before calling `compute_f0`.\"\n    assert pitch_fmin is not None, \" [!] 
Set `pitch_fmin` before calling `compute_f0`.\"\n\n    f0, voiced_mask, _ = pyin(\n        y=x.astype(np.double),\n        fmin=pitch_fmin,\n        fmax=pitch_fmax,\n        sr=sample_rate,\n        frame_length=win_length,\n        win_length=win_length // 2,\n        hop_length=hop_length,\n        pad_mode=stft_pad_mode,\n        center=center,\n        n_thresholds=100,\n        beta_parameters=(2, 18),\n        boltzmann_parameter=2,\n        resolution=0.1,\n        max_transition_rate=35.92,\n        switch_prob=0.01,\n        no_trough_prob=0.01,\n    )\n    f0[~voiced_mask] = 0.0\n\n    return f0\n\n\ndef compute_energy(y: np.ndarray, **kwargs) -> np.ndarray:\n    \"\"\"Compute energy of a waveform using the same parameters used for computing melspectrogram.\n    Args:\n      y (np.ndarray): Waveform. Shape :math:`[T_wav,]`\n    Returns:\n      np.ndarray: Energy. Shape :math:`[T_energy,]`. :math:`T_energy == T_wav / hop_length`\n    Examples:\n      >>> WAV_FILE = filename = librosa.util.example_audio_file()\n      >>> from TTS.config import BaseAudioConfig\n      >>> from TTS.utils.audio import AudioProcessor\n      >>> conf = BaseAudioConfig()\n      >>> ap = AudioProcessor(**conf)\n      >>> wav = ap.load_wav(WAV_FILE, sr=ap.sample_rate)[:5 * ap.sample_rate]\n      >>> energy = ap.compute_energy(wav)\n    \"\"\"\n    x = stft(y=y, **kwargs)\n    mag, _ = magphase(x)\n    energy = np.sqrt(np.sum(mag**2, axis=0))\n    return energy\n\n\n### Audio Processing ###\ndef find_endpoint(\n    *,\n    wav: np.ndarray = None,\n    trim_db: float = -40,\n    sample_rate: int = None,\n    min_silence_sec=0.8,\n    gain: float = None,\n    base: int = None,\n    **kwargs,\n) -> int:\n    \"\"\"Find the last point without silence at the end of an audio signal.\n\n    Args:\n        wav (np.ndarray): Audio signal.\n        trim_db (float, optional): Silence threshold in decibels. 
Defaults to -40.\n        min_silence_sec (float, optional): Ignore silences that are shorter than this, in seconds. Defaults to 0.8.\n        gain (float, optional): Gain to be used to convert trim_db to trim_amp. Defaults to None.\n        base (int, optional): Base of the logarithm used to convert trim_db to trim_amp. Defaults to None.\n\n    Returns:\n        int: Last point without silence.\n    \"\"\"\n    window_length = int(sample_rate * min_silence_sec)\n    hop_length = int(window_length / 4)\n    threshold = db_to_amp(x=-trim_db, gain=gain, base=base)\n    for x in range(hop_length, len(wav) - window_length, hop_length):\n        if np.max(wav[x : x + window_length]) < threshold:\n            return x + hop_length\n    return len(wav)\n\n\ndef trim_silence(\n    *,\n    wav: np.ndarray = None,\n    sample_rate: int = None,\n    trim_db: float = None,\n    win_length: int = None,\n    hop_length: int = None,\n    **kwargs,\n) -> np.ndarray:\n    \"\"\"Trim silent parts with a threshold and 0.01 sec margin\"\"\"\n    margin = int(sample_rate * 0.01)\n    wav = wav[margin:-margin]\n    return librosa.effects.trim(wav, top_db=trim_db, frame_length=win_length, hop_length=hop_length)[0]\n\n\ndef volume_norm(*, x: np.ndarray = None, coef: float = 0.95, **kwargs) -> np.ndarray:\n    \"\"\"Normalize the volume of an audio signal.\n\n    Args:\n        x (np.ndarray): Raw waveform.\n        coef (float): Coefficient to rescale the maximum value. 
Defaults to 0.95.\n\n    Returns:\n        np.ndarray: Volume normalized waveform.\n    \"\"\"\n    return x / abs(x).max() * coef\n\n\ndef rms_norm(*, wav: np.ndarray = None, db_level: float = -27.0, **kwargs) -> np.ndarray:\n    r = 10 ** (db_level / 20)\n    a = np.sqrt((len(wav) * (r**2)) / np.sum(wav**2))\n    return wav * a\n\n\ndef rms_volume_norm(*, x: np.ndarray, db_level: float = -27.0, **kwargs) -> np.ndarray:\n    \"\"\"Normalize the volume based on RMS of the signal.\n\n    Args:\n        x (np.ndarray): Raw waveform.\n        db_level (float): Target dB level in RMS. Defaults to -27.0.\n\n    Returns:\n        np.ndarray: RMS normalized waveform.\n    \"\"\"\n    assert -99 <= db_level <= 0, \" [!] db_level should be between -99 and 0\"\n    wav = rms_norm(wav=x, db_level=db_level)\n    return wav\n\n\ndef load_wav(*, filename: str, sample_rate: int = None, resample: bool = False, **kwargs) -> np.ndarray:\n    \"\"\"Read a wav file and optionally resample it while loading.\n\n    Resampling slows down loading the file significantly. Therefore it is recommended to resample the files beforehand.\n\n    Args:\n        filename (str): Path to the wav file.\n        sample_rate (int, optional): Sampling rate for resampling. Defaults to None.\n        resample (bool, optional): Resample the audio file when loading. Slows down the I/O time. Defaults to False.\n\n    Returns:\n        np.ndarray: Loaded waveform.\n    \"\"\"\n    if resample:\n        # loading with resampling. 
It is significantly slower.\n        x, _ = librosa.load(filename, sr=sample_rate)\n    else:\n        # SF is faster than librosa for loading files\n        x, _ = sf.read(filename)\n    return x\n\n\ndef save_wav(*, wav: np.ndarray, path: str, sample_rate: int = None, **kwargs) -> None:\n    \"\"\"Save float waveform to a file using Scipy.\n\n    Args:\n        wav (np.ndarray): Waveform with float values in range [-1, 1] to save.\n        path (str): Path to an output file.\n        sample_rate (int, optional): Sampling rate used for saving to the file. Defaults to None.\n    \"\"\"\n    wav_norm = wav * (32767 / max(0.01, np.max(np.abs(wav))))\n    scipy.io.wavfile.write(path, sample_rate, wav_norm.astype(np.int16))\n\n\ndef mulaw_encode(*, wav: np.ndarray, mulaw_qc: int, **kwargs) -> np.ndarray:\n    mu = 2**mulaw_qc - 1\n    signal = np.sign(wav) * np.log(1 + mu * np.abs(wav)) / np.log(1.0 + mu)\n    signal = (signal + 1) / 2 * mu + 0.5\n    return np.floor(signal)\n\n\ndef mulaw_decode(*, wav, mulaw_qc: int, **kwargs) -> np.ndarray:\n    \"\"\"Recovers waveform from quantized values.\"\"\"\n    mu = 2**mulaw_qc - 1\n    x = np.sign(wav) / mu * ((1 + mu) ** np.abs(wav) - 1)\n    return x\n\n\ndef encode_16bits(*, x: np.ndarray, **kwargs) -> np.ndarray:\n    return np.clip(x * 2**15, -(2**15), 2**15 - 1).astype(np.int16)\n\n\ndef quantize(*, x: np.ndarray, quantize_bits: int, **kwargs) -> np.ndarray:\n    \"\"\"Quantize a waveform to a given number of bits.\n\n    Args:\n        x (np.ndarray): Waveform to quantize. Must be normalized into the range `[-1, 1]`.\n        quantize_bits (int): Number of quantization bits.\n\n    Returns:\n        np.ndarray: Quantized waveform.\n    \"\"\"\n    return (x + 1.0) * (2**quantize_bits - 1) / 2\n\n\ndef dequantize(*, x, quantize_bits, **kwargs) -> np.ndarray:\n    \"\"\"Dequantize a waveform from the given number of bits.\"\"\"\n    return 2 * x / (2**quantize_bits - 1) - 1\n"
  },
  {
    "path": "TTS/utils/audio/processor.py",
    "content": "from typing import Dict, Tuple\n\nimport librosa\nimport numpy as np\nimport scipy.io.wavfile\nimport scipy.signal\nimport soundfile as sf\n\nfrom TTS.tts.utils.helpers import StandardScaler\nfrom TTS.utils.audio.numpy_transforms import compute_f0\n\n# pylint: disable=too-many-public-methods\n\n\nclass AudioProcessor(object):\n    \"\"\"Audio Processor for TTS.\n\n    Note:\n        All the class arguments are set to default values to enable a flexible initialization\n        of the class with the model config. They are not meaningful for all the arguments.\n\n    Args:\n        sample_rate (int, optional):\n            target audio sampling rate. Defaults to None.\n\n        resample (bool, optional):\n            enable/disable resampling of the audio clips when the target sampling rate does not match the original sampling rate. Defaults to False.\n\n        num_mels (int, optional):\n            number of melspectrogram dimensions. Defaults to None.\n\n        log_func (int, optional):\n            log exponent used for converting spectrogram aplitude to DB.\n\n        min_level_db (int, optional):\n            minimum db threshold for the computed melspectrograms. Defaults to None.\n\n        frame_shift_ms (int, optional):\n            milliseconds of frames between STFT columns. Defaults to None.\n\n        frame_length_ms (int, optional):\n            milliseconds of STFT window length. Defaults to None.\n\n        hop_length (int, optional):\n            number of frames between STFT columns. Used if ```frame_shift_ms``` is None. Defaults to None.\n\n        win_length (int, optional):\n            STFT window length. Used if ```frame_length_ms``` is None. Defaults to None.\n\n        ref_level_db (int, optional):\n            reference DB level to avoid background noise. In general <20DB corresponds to the air noise. Defaults to None.\n\n        fft_size (int, optional):\n            FFT window size for STFT. 
Defaults to 1024.\n\n        power (int, optional):\n            Exponent value applied to the spectrogram before GriffinLim. Defaults to None.\n\n        preemphasis (float, optional):\n            Preemphasis coefficient. Preemphasis is disabled if the coefficient is 0.0. Defaults to 0.0.\n\n        signal_norm (bool, optional):\n            enable/disable signal normalization. Defaults to None.\n\n        symmetric_norm (bool, optional):\n            enable/disable symmetric normalization. If True, normalization is performed in the range [-k, k], else in [0, k]. Defaults to None.\n\n        max_norm (float, optional):\n            ```k``` defining the normalization range. Defaults to None.\n\n        mel_fmin (int, optional):\n            minimum filter frequency for computing melspectrograms. Defaults to None.\n\n        mel_fmax (int, optional):\n            maximum filter frequency for computing melspectrograms. Defaults to None.\n\n        pitch_fmin (int, optional):\n            minimum filter frequency for computing pitch. Defaults to None.\n\n        pitch_fmax (int, optional):\n            maximum filter frequency for computing pitch. Defaults to None.\n\n        spec_gain (int, optional):\n            gain applied when converting amplitude to DB. Defaults to 20.\n\n        stft_pad_mode (str, optional):\n            Padding mode for STFT. Defaults to 'reflect'.\n\n        clip_norm (bool, optional):\n            enable/disable clipping the out-of-range values in the normalized audio signal. Defaults to True.\n\n        griffin_lim_iters (int, optional):\n            Number of GriffinLim iterations. Defaults to None.\n\n        do_trim_silence (bool, optional):\n            enable/disable silence trimming when loading the audio signal. Defaults to False.\n\n        trim_db (int, optional):\n            DB threshold used for silence trimming. Defaults to 60.\n\n        do_sound_norm (bool, optional):\n            enable/disable signal normalization. 
Defaults to False.\n\n        do_amp_to_db_linear (bool, optional):\n            enable/disable amplitude to dB conversion of linear spectrograms. Defaults to True.\n\n        do_amp_to_db_mel (bool, optional):\n            enable/disable amplitude to dB conversion of mel spectrograms. Defaults to True.\n\n        do_rms_norm (bool, optional):\n            enable/disable RMS volume normalization when loading an audio file. Defaults to False.\n\n        db_level (int, optional):\n            dB level used for rms normalization. The range is -99 to 0. Defaults to None.\n\n        stats_path (str, optional):\n            Path to the computed stats file. Defaults to None.\n\n        verbose (bool, optional):\n            enable/disable logging. Defaults to True.\n\n    \"\"\"\n\n    def __init__(\n        self,\n        sample_rate=None,\n        resample=False,\n        num_mels=None,\n        log_func=\"np.log10\",\n        min_level_db=None,\n        frame_shift_ms=None,\n        frame_length_ms=None,\n        hop_length=None,\n        win_length=None,\n        ref_level_db=None,\n        fft_size=1024,\n        power=None,\n        preemphasis=0.0,\n        signal_norm=None,\n        symmetric_norm=None,\n        max_norm=None,\n        mel_fmin=None,\n        mel_fmax=None,\n        pitch_fmax=None,\n        pitch_fmin=None,\n        spec_gain=20,\n        stft_pad_mode=\"reflect\",\n        clip_norm=True,\n        griffin_lim_iters=None,\n        do_trim_silence=False,\n        trim_db=60,\n        do_sound_norm=False,\n        do_amp_to_db_linear=True,\n        do_amp_to_db_mel=True,\n        do_rms_norm=False,\n        db_level=None,\n        stats_path=None,\n        verbose=True,\n        **_,\n    ):\n        # setup class attributes\n        self.sample_rate = sample_rate\n        self.resample = resample\n        self.num_mels = num_mels\n        self.log_func = log_func\n        self.min_level_db = min_level_db or 0\n        self.frame_shift_ms = 
frame_shift_ms\n        self.frame_length_ms = frame_length_ms\n        self.ref_level_db = ref_level_db\n        self.fft_size = fft_size\n        self.power = power\n        self.preemphasis = preemphasis\n        self.griffin_lim_iters = griffin_lim_iters\n        self.signal_norm = signal_norm\n        self.symmetric_norm = symmetric_norm\n        self.mel_fmin = mel_fmin or 0\n        self.mel_fmax = mel_fmax\n        self.pitch_fmin = pitch_fmin\n        self.pitch_fmax = pitch_fmax\n        self.spec_gain = float(spec_gain)\n        self.stft_pad_mode = stft_pad_mode\n        self.max_norm = 1.0 if max_norm is None else float(max_norm)\n        self.clip_norm = clip_norm\n        self.do_trim_silence = do_trim_silence\n        self.trim_db = trim_db\n        self.do_sound_norm = do_sound_norm\n        self.do_amp_to_db_linear = do_amp_to_db_linear\n        self.do_amp_to_db_mel = do_amp_to_db_mel\n        self.do_rms_norm = do_rms_norm\n        self.db_level = db_level\n        self.stats_path = stats_path\n        # setup exp_func for db to amp conversion\n        if log_func == \"np.log\":\n            self.base = np.e\n        elif log_func == \"np.log10\":\n            self.base = 10\n        else:\n            raise ValueError(\" [!] unknown `log_func` value.\")\n        # setup stft parameters\n        if hop_length is None:\n            # compute stft parameters from given time values\n            self.hop_length, self.win_length = self._stft_parameters()\n        else:\n            # use stft parameters from config file\n            self.hop_length = hop_length\n            self.win_length = win_length\n        assert min_level_db != 0.0, \" [!] min_level_db is 0\"\n        assert (\n            self.win_length <= self.fft_size\n        ), f\" [!] 
win_length cannot be larger than fft_size - {self.win_length} vs {self.fft_size}\"\n        members = vars(self)\n        if verbose:\n            print(\" > Setting up Audio Processor...\")\n            for key, value in members.items():\n                print(\" | > {}:{}\".format(key, value))\n        # create spectrogram utils\n        self.mel_basis = self._build_mel_basis()\n        # reuse the basis built above instead of computing it a second time\n        self.inv_mel_basis = np.linalg.pinv(self.mel_basis)\n        # setup scaler\n        if stats_path and signal_norm:\n            mel_mean, mel_std, linear_mean, linear_std, _ = self.load_stats(stats_path)\n            self.setup_scaler(mel_mean, mel_std, linear_mean, linear_std)\n            self.signal_norm = True\n            self.max_norm = None\n            self.clip_norm = None\n            self.symmetric_norm = None\n\n    @staticmethod\n    def init_from_config(config: \"Coqpit\", verbose=True):\n        if \"audio\" in config:\n            return AudioProcessor(verbose=verbose, **config.audio)\n        return AudioProcessor(verbose=verbose, **config)\n\n    ### setting up the parameters ###\n    def _build_mel_basis(\n        self,\n    ) -> np.ndarray:\n        \"\"\"Build melspectrogram basis.\n\n        Returns:\n            np.ndarray: melspectrogram basis.\n        \"\"\"\n        if self.mel_fmax is not None:\n            assert self.mel_fmax <= self.sample_rate // 2\n        # keyword arguments: librosa >= 0.10 no longer accepts `sr` and `n_fft` positionally\n        return librosa.filters.mel(\n            sr=self.sample_rate, n_fft=self.fft_size, n_mels=self.num_mels, fmin=self.mel_fmin, fmax=self.mel_fmax\n        )\n\n    def _stft_parameters(\n        self,\n    ) -> Tuple[int, int]:\n        \"\"\"Compute the real STFT parameters from the time values.\n\n        Returns:\n            Tuple[int, int]: hop length and window length for STFT.\n        \"\"\"\n        factor = self.frame_length_ms / self.frame_shift_ms\n        assert (factor).is_integer(), \" [!] 
frame_shift_ms should divide frame_length_ms\"\n        hop_length = int(self.frame_shift_ms / 1000.0 * self.sample_rate)\n        win_length = int(hop_length * factor)\n        return hop_length, win_length\n\n    ### normalization ###\n    def normalize(self, S: np.ndarray) -> np.ndarray:\n        \"\"\"Normalize values into `[0, self.max_norm]` or `[-self.max_norm, self.max_norm]`\n\n        Args:\n            S (np.ndarray): Spectrogram to normalize.\n\n        Raises:\n            RuntimeError: Mean and variance is computed from incompatible parameters.\n\n        Returns:\n            np.ndarray: Normalized spectrogram.\n        \"\"\"\n        # pylint: disable=no-else-return\n        S = S.copy()\n        if self.signal_norm:\n            # mean-var scaling\n            if hasattr(self, \"mel_scaler\"):\n                if S.shape[0] == self.num_mels:\n                    return self.mel_scaler.transform(S.T).T\n                elif S.shape[0] == self.fft_size / 2:\n                    return self.linear_scaler.transform(S.T).T\n                else:\n                    raise RuntimeError(\" [!] 
Mean-Var stats does not match the given feature dimensions.\")\n            # range normalization\n            S -= self.ref_level_db  # discard certain range of DB assuming it is air noise\n            S_norm = (S - self.min_level_db) / (-self.min_level_db)\n            if self.symmetric_norm:\n                S_norm = ((2 * self.max_norm) * S_norm) - self.max_norm\n                if self.clip_norm:\n                    S_norm = np.clip(\n                        S_norm, -self.max_norm, self.max_norm  # pylint: disable=invalid-unary-operand-type\n                    )\n                return S_norm\n            else:\n                S_norm = self.max_norm * S_norm\n                if self.clip_norm:\n                    S_norm = np.clip(S_norm, 0, self.max_norm)\n                return S_norm\n        else:\n            return S\n\n    def denormalize(self, S: np.ndarray) -> np.ndarray:\n        \"\"\"Denormalize spectrogram values.\n\n        Args:\n            S (np.ndarray): Spectrogram to denormalize.\n\n        Raises:\n            RuntimeError: Mean and variance are incompatible.\n\n        Returns:\n            np.ndarray: Denormalized spectrogram.\n        \"\"\"\n        # pylint: disable=no-else-return\n        S_denorm = S.copy()\n        if self.signal_norm:\n            # mean-var scaling\n            if hasattr(self, \"mel_scaler\"):\n                if S_denorm.shape[0] == self.num_mels:\n                    return self.mel_scaler.inverse_transform(S_denorm.T).T\n                elif S_denorm.shape[0] == self.fft_size / 2:\n                    return self.linear_scaler.inverse_transform(S_denorm.T).T\n                else:\n                    raise RuntimeError(\" [!] 
Mean-Var stats does not match the given feature dimensions.\")\n            if self.symmetric_norm:\n                if self.clip_norm:\n                    S_denorm = np.clip(\n                        S_denorm, -self.max_norm, self.max_norm  # pylint: disable=invalid-unary-operand-type\n                    )\n                S_denorm = ((S_denorm + self.max_norm) * -self.min_level_db / (2 * self.max_norm)) + self.min_level_db\n                return S_denorm + self.ref_level_db\n            else:\n                if self.clip_norm:\n                    S_denorm = np.clip(S_denorm, 0, self.max_norm)\n                S_denorm = (S_denorm * -self.min_level_db / self.max_norm) + self.min_level_db\n                return S_denorm + self.ref_level_db\n        else:\n            return S_denorm\n\n    ### Mean-STD scaling ###\n    def load_stats(self, stats_path: str) -> Tuple[np.array, np.array, np.array, np.array, Dict]:\n        \"\"\"Load mean and variance statistics from a `npy` file.\n\n        Args:\n            stats_path (str): Path to the `npy` file containing the statistics.\n\n        Returns:\n            Tuple[np.array, np.array, np.array, np.array, Dict]: loaded statistics and the config used to\n                compute them.\n        \"\"\"\n        stats = np.load(stats_path, allow_pickle=True).item()  # pylint: disable=unexpected-keyword-arg\n        mel_mean = stats[\"mel_mean\"]\n        mel_std = stats[\"mel_std\"]\n        linear_mean = stats[\"linear_mean\"]\n        linear_std = stats[\"linear_std\"]\n        stats_config = stats[\"audio_config\"]\n        # check all audio parameters used for computing stats\n        skip_parameters = [\"griffin_lim_iters\", \"stats_path\", \"do_trim_silence\", \"ref_level_db\", \"power\"]\n        for key in stats_config.keys():\n            if key in skip_parameters:\n                continue\n            if key not in [\"sample_rate\", \"trim_db\"]:\n                assert (\n                    stats_config[key] == 
self.__dict__[key]\n                ), f\" [!] Audio param {key} does not match the value used for computing mean-var stats. {stats_config[key]} vs {self.__dict__[key]}\"\n        return mel_mean, mel_std, linear_mean, linear_std, stats_config\n\n    # pylint: disable=attribute-defined-outside-init\n    def setup_scaler(\n        self, mel_mean: np.ndarray, mel_std: np.ndarray, linear_mean: np.ndarray, linear_std: np.ndarray\n    ) -> None:\n        \"\"\"Initialize scaler objects used in mean-std normalization.\n\n        Args:\n            mel_mean (np.ndarray): Mean for melspectrograms.\n            mel_std (np.ndarray): STD for melspectrograms.\n            linear_mean (np.ndarray): Mean for full scale spectrograms.\n            linear_std (np.ndarray): STD for full scale spectrograms.\n        \"\"\"\n        self.mel_scaler = StandardScaler()\n        self.mel_scaler.set_stats(mel_mean, mel_std)\n        self.linear_scaler = StandardScaler()\n        self.linear_scaler.set_stats(linear_mean, linear_std)\n\n    ### DB and AMP conversion ###\n    # pylint: disable=no-self-use\n    def _amp_to_db(self, x: np.ndarray) -> np.ndarray:\n        \"\"\"Convert amplitude values to decibels.\n\n        Args:\n            x (np.ndarray): Amplitude spectrogram.\n\n        Returns:\n            np.ndarray: Decibels spectrogram.\n        \"\"\"\n        return self.spec_gain * _log(np.maximum(1e-5, x), self.base)\n\n    # pylint: disable=no-self-use\n    def _db_to_amp(self, x: np.ndarray) -> np.ndarray:\n        \"\"\"Convert decibels spectrogram to amplitude spectrogram.\n\n        Args:\n            x (np.ndarray): Decibels spectrogram.\n\n        Returns:\n            np.ndarray: Amplitude spectrogram.\n        \"\"\"\n        return _exp(x / self.spec_gain, self.base)\n\n    ### Preemphasis ###\n    def apply_preemphasis(self, x: np.ndarray) -> np.ndarray:\n        \"\"\"Apply pre-emphasis to the audio signal. 
Useful to reduce the correlation between neighbouring signal values.\n\n        Args:\n            x (np.ndarray): Audio signal.\n\n        Raises:\n            RuntimeError: Preemphasis coeff is set to 0.\n\n        Returns:\n            np.ndarray: Decorrelated audio signal.\n        \"\"\"\n        if self.preemphasis == 0:\n            raise RuntimeError(\" [!] Preemphasis is set 0.0.\")\n        return scipy.signal.lfilter([1, -self.preemphasis], [1], x)\n\n    def apply_inv_preemphasis(self, x: np.ndarray) -> np.ndarray:\n        \"\"\"Reverse pre-emphasis.\"\"\"\n        if self.preemphasis == 0:\n            raise RuntimeError(\" [!] Preemphasis is set 0.0.\")\n        return scipy.signal.lfilter([1], [1, -self.preemphasis], x)\n\n    ### SPECTROGRAMs ###\n    def _linear_to_mel(self, spectrogram: np.ndarray) -> np.ndarray:\n        \"\"\"Project a full scale spectrogram to a melspectrogram.\n\n        Args:\n            spectrogram (np.ndarray): Full scale spectrogram.\n\n        Returns:\n            np.ndarray: Melspectrogram\n        \"\"\"\n        return np.dot(self.mel_basis, spectrogram)\n\n    def _mel_to_linear(self, mel_spec: np.ndarray) -> np.ndarray:\n        \"\"\"Convert a melspectrogram to full scale spectrogram.\"\"\"\n        return np.maximum(1e-10, np.dot(self.inv_mel_basis, mel_spec))\n\n    def spectrogram(self, y: np.ndarray) -> np.ndarray:\n        \"\"\"Compute a spectrogram from a waveform.\n\n        Args:\n            y (np.ndarray): Waveform.\n\n        Returns:\n            np.ndarray: Spectrogram.\n        \"\"\"\n        if self.preemphasis != 0:\n            D = self._stft(self.apply_preemphasis(y))\n        else:\n            D = self._stft(y)\n        if self.do_amp_to_db_linear:\n            S = self._amp_to_db(np.abs(D))\n        else:\n            S = np.abs(D)\n        return self.normalize(S).astype(np.float32)\n\n    def melspectrogram(self, y: np.ndarray) -> np.ndarray:\n        \"\"\"Compute a melspectrogram from a 
waveform.\"\"\"\n        if self.preemphasis != 0:\n            D = self._stft(self.apply_preemphasis(y))\n        else:\n            D = self._stft(y)\n        if self.do_amp_to_db_mel:\n            S = self._amp_to_db(self._linear_to_mel(np.abs(D)))\n        else:\n            S = self._linear_to_mel(np.abs(D))\n        return self.normalize(S).astype(np.float32)\n\n    def inv_spectrogram(self, spectrogram: np.ndarray) -> np.ndarray:\n        \"\"\"Convert a spectrogram to a waveform using the Griffin-Lim vocoder.\"\"\"\n        S = self.denormalize(spectrogram)\n        S = self._db_to_amp(S)\n        # Reconstruct phase\n        if self.preemphasis != 0:\n            return self.apply_inv_preemphasis(self._griffin_lim(S**self.power))\n        return self._griffin_lim(S**self.power)\n\n    def inv_melspectrogram(self, mel_spectrogram: np.ndarray) -> np.ndarray:\n        \"\"\"Convert a melspectrogram to a waveform using the Griffin-Lim vocoder.\"\"\"\n        D = self.denormalize(mel_spectrogram)\n        S = self._db_to_amp(D)\n        S = self._mel_to_linear(S)  # Convert back to linear\n        if self.preemphasis != 0:\n            return self.apply_inv_preemphasis(self._griffin_lim(S**self.power))\n        return self._griffin_lim(S**self.power)\n\n    def out_linear_to_mel(self, linear_spec: np.ndarray) -> np.ndarray:\n        \"\"\"Convert a full scale linear spectrogram output of a network to a melspectrogram.\n\n        Args:\n            linear_spec (np.ndarray): Normalized full scale linear spectrogram.\n\n        Returns:\n            np.ndarray: Normalized melspectrogram.\n        \"\"\"\n        S = self.denormalize(linear_spec)\n        S = self._db_to_amp(S)\n        S = self._linear_to_mel(np.abs(S))\n        S = self._amp_to_db(S)\n        mel = self.normalize(S)\n        return mel\n\n    ### STFT and ISTFT ###\n    def _stft(self, y: np.ndarray) -> np.ndarray:\n        \"\"\"Librosa STFT wrapper.\n\n        Args:\n            y (np.ndarray): Audio 
signal.\n\n        Returns:\n            np.ndarray: Complex number array.\n        \"\"\"\n        return librosa.stft(\n            y=y,\n            n_fft=self.fft_size,\n            hop_length=self.hop_length,\n            win_length=self.win_length,\n            pad_mode=self.stft_pad_mode,\n            window=\"hann\",\n            center=True,\n        )\n\n    def _istft(self, y: np.ndarray) -> np.ndarray:\n        \"\"\"Librosa iSTFT wrapper.\"\"\"\n        return librosa.istft(y, hop_length=self.hop_length, win_length=self.win_length)\n\n    def _griffin_lim(self, S):\n        angles = np.exp(2j * np.pi * np.random.rand(*S.shape))\n        # `np.complex` was removed in NumPy 1.24; the builtin `complex` is equivalent\n        S_complex = np.abs(S).astype(complex)\n        y = self._istft(S_complex * angles)\n        if not np.isfinite(y).all():\n            print(\" [!] Waveform is not finite everywhere. Skipping the GL.\")\n            return np.array([0.0])\n        for _ in range(self.griffin_lim_iters):\n            angles = np.exp(1j * np.angle(self._stft(y)))\n            y = self._istft(S_complex * angles)\n        return y\n\n    def compute_stft_paddings(self, x, pad_sides=1):\n        \"\"\"Compute paddings used by Librosa's STFT. 
Compute right padding (final frame) or both sides padding\n        (first and final frames)\"\"\"\n        assert pad_sides in (1, 2)\n        pad = (x.shape[0] // self.hop_length + 1) * self.hop_length - x.shape[0]\n        if pad_sides == 1:\n            return 0, pad\n        return pad // 2, pad // 2 + pad % 2\n\n    def compute_f0(self, x: np.ndarray) -> np.ndarray:\n        \"\"\"Compute pitch (f0) of a waveform using the same parameters used for computing melspectrogram.\n\n        Args:\n            x (np.ndarray): Waveform.\n\n        Returns:\n            np.ndarray: Pitch.\n\n        Examples:\n            >>> WAV_FILE = filename = librosa.util.example_audio_file()\n            >>> from TTS.config import BaseAudioConfig\n            >>> from TTS.utils.audio import AudioProcessor\n            >>> conf = BaseAudioConfig(pitch_fmax=640, pitch_fmin=1)\n            >>> ap = AudioProcessor(**conf)\n            >>> wav = ap.load_wav(WAV_FILE, sr=ap.sample_rate)[:5 * ap.sample_rate]\n            >>> pitch = ap.compute_f0(wav)\n        \"\"\"\n        assert self.pitch_fmax is not None, \" [!] Set `pitch_fmax` before calling `compute_f0`.\"\n        assert self.pitch_fmin is not None, \" [!] 
Set `pitch_fmin` before calling `compute_f0`.\"\n        # align F0 length to the spectrogram length\n        if len(x) % self.hop_length == 0:\n            x = np.pad(x, (0, self.hop_length // 2), mode=self.stft_pad_mode)\n\n        f0 = compute_f0(\n            x=x,\n            pitch_fmax=self.pitch_fmax,\n            pitch_fmin=self.pitch_fmin,\n            hop_length=self.hop_length,\n            win_length=self.win_length,\n            sample_rate=self.sample_rate,\n            stft_pad_mode=self.stft_pad_mode,\n            center=True,\n        )\n\n        return f0\n\n    ### Audio Processing ###\n    def find_endpoint(self, wav: np.ndarray, min_silence_sec=0.8) -> int:\n        \"\"\"Find the last point without silence at the end of an audio signal.\n\n        Args:\n            wav (np.ndarray): Audio signal.\n            threshold_db (int, optional): Silence threshold in decibels. Defaults to -40.\n            min_silence_sec (float, optional): Ignore silences that are shorter than this, in seconds. 
Defaults to 0.8.\n\n        Returns:\n            int: Last point without silence.\n        \"\"\"\n        window_length = int(self.sample_rate * min_silence_sec)\n        hop_length = int(window_length / 4)\n        threshold = self._db_to_amp(-self.trim_db)\n        for x in range(hop_length, len(wav) - window_length, hop_length):\n            if np.max(wav[x : x + window_length]) < threshold:\n                return x + hop_length\n        return len(wav)\n\n    def trim_silence(self, wav):\n        \"\"\"Trim silent parts with a threshold and 0.01 sec margin\"\"\"\n        margin = int(self.sample_rate * 0.01)\n        wav = wav[margin:-margin]\n        return librosa.effects.trim(wav, top_db=self.trim_db, frame_length=self.win_length, hop_length=self.hop_length)[\n            0\n        ]\n\n    @staticmethod\n    def sound_norm(x: np.ndarray) -> np.ndarray:\n        \"\"\"Normalize the volume of an audio signal.\n\n        Args:\n            x (np.ndarray): Raw waveform.\n\n        Returns:\n            np.ndarray: Volume normalized waveform.\n        \"\"\"\n        return x / abs(x).max() * 0.95\n\n    @staticmethod\n    def _rms_norm(wav, db_level=-27):\n        r = 10 ** (db_level / 20)\n        a = np.sqrt((len(wav) * (r**2)) / np.sum(wav**2))\n        return wav * a\n\n    def rms_volume_norm(self, x: np.ndarray, db_level: float = None) -> np.ndarray:\n        \"\"\"Normalize the volume based on RMS of the signal.\n\n        Args:\n            x (np.ndarray): Raw waveform.\n\n        Returns:\n            np.ndarray: RMS normalized waveform.\n        \"\"\"\n        if db_level is None:\n            db_level = self.db_level\n        assert -99 <= db_level <= 0, \" [!] 
db_level should be between -99 and 0\"\n        wav = self._rms_norm(x, db_level)\n        return wav\n\n    ### save and load ###\n    def load_wav(self, filename: str, sr: int = None) -> np.ndarray:\n        \"\"\"Read a wav file using Librosa and optionally resample, silence trim, volume normalize.\n\n        Resampling slows down loading the file significantly. Therefore it is recommended to resample the file beforehand.\n\n        Args:\n            filename (str): Path to the wav file.\n            sr (int, optional): Sampling rate for resampling. Defaults to None.\n\n        Returns:\n            np.ndarray: Loaded waveform.\n        \"\"\"\n        if self.resample:\n            # loading with resampling. It is significantly slower.\n            x, sr = librosa.load(filename, sr=self.sample_rate)\n        elif sr is None:\n            # SF is faster than librosa for loading files\n            x, sr = sf.read(filename)\n            assert self.sample_rate == sr, \"%s vs %s\" % (self.sample_rate, sr)\n        else:\n            x, sr = librosa.load(filename, sr=sr)\n        if self.do_trim_silence:\n            try:\n                x = self.trim_silence(x)\n            except ValueError:\n                print(f\" [!] File cannot be trimmed for silence - {filename}\")\n        if self.do_sound_norm:\n            x = self.sound_norm(x)\n        if self.do_rms_norm:\n            x = self.rms_volume_norm(x, self.db_level)\n        return x\n\n    def save_wav(self, wav: np.ndarray, path: str, sr: int = None) -> None:\n        \"\"\"Save a waveform to a file using Scipy.\n\n        Args:\n            wav (np.ndarray): Waveform to save.\n            path (str): Path to an output file.\n            sr (int, optional): Sampling rate used for saving to the file.
Defaults to None.\n        \"\"\"\n        if self.do_rms_norm:\n            wav_norm = self.rms_volume_norm(wav, self.db_level) * 32767\n        else:\n            wav_norm = wav * (32767 / max(0.01, np.max(np.abs(wav))))\n\n        scipy.io.wavfile.write(path, sr if sr else self.sample_rate, wav_norm.astype(np.int16))\n\n    def get_duration(self, filename: str) -> float:\n        \"\"\"Get the duration of a wav file using Librosa.\n\n        Args:\n            filename (str): Path to the wav file.\n        \"\"\"\n        # Pass the path by keyword; the first positional argument of `get_duration` is the audio array `y`.\n        return librosa.get_duration(filename=filename)\n\n    @staticmethod\n    def mulaw_encode(wav: np.ndarray, qc: int) -> np.ndarray:\n        mu = 2**qc - 1\n        # wav_abs = np.minimum(np.abs(wav), 1.0)\n        signal = np.sign(wav) * np.log(1 + mu * np.abs(wav)) / np.log(1.0 + mu)\n        # Quantize signal to the specified number of levels.\n        signal = (signal + 1) / 2 * mu + 0.5\n        return np.floor(\n            signal,\n        )\n\n    @staticmethod\n    def mulaw_decode(wav, qc):\n        \"\"\"Recovers waveform from quantized values.\"\"\"\n        mu = 2**qc - 1\n        x = np.sign(wav) / mu * ((1 + mu) ** np.abs(wav) - 1)\n        return x\n\n    @staticmethod\n    def encode_16bits(x):\n        return np.clip(x * 2**15, -(2**15), 2**15 - 1).astype(np.int16)\n\n    @staticmethod\n    def quantize(x: np.ndarray, bits: int) -> np.ndarray:\n        \"\"\"Quantize a waveform to a given number of bits.\n\n        Args:\n            x (np.ndarray): Waveform to quantize.
Must be normalized into the range `[-1, 1]`.\n            bits (int): Number of quantization bits.\n\n        Returns:\n            np.ndarray: Quantized waveform.\n        \"\"\"\n        return (x + 1.0) * (2**bits - 1) / 2\n\n    @staticmethod\n    def dequantize(x, bits):\n        \"\"\"Dequantize a waveform from the given number of bits.\"\"\"\n        return 2 * x / (2**bits - 1) - 1\n\n\ndef _log(x, base):\n    if base == 10:\n        return np.log10(x)\n    return np.log(x)\n\n\ndef _exp(x, base):\n    if base == 10:\n        return np.power(10, x)\n    return np.exp(x)\n"
  },
  {
    "path": "TTS/utils/audio/torch_transforms.py",
    "content": "import librosa\nimport torch\nfrom torch import nn\n\n\nclass TorchSTFT(nn.Module):  # pylint: disable=abstract-method\n    \"\"\"Some of the audio processing funtions using Torch for faster batch processing.\n\n    Args:\n\n        n_fft (int):\n            FFT window size for STFT.\n\n        hop_length (int):\n            number of frames between STFT columns.\n\n        win_length (int, optional):\n            STFT window length.\n\n        pad_wav (bool, optional):\n            If True pad the audio with (n_fft - hop_length) / 2). Defaults to False.\n\n        window (str, optional):\n            The name of a function to create a window tensor that is applied/multiplied to each frame/window. Defaults to \"hann_window\"\n\n        sample_rate (int, optional):\n            target audio sampling rate. Defaults to None.\n\n        mel_fmin (int, optional):\n            minimum filter frequency for computing melspectrograms. Defaults to None.\n\n        mel_fmax (int, optional):\n            maximum filter frequency for computing melspectrograms. Defaults to None.\n\n        n_mels (int, optional):\n            number of melspectrogram dimensions. Defaults to None.\n\n        use_mel (bool, optional):\n            If True compute the melspectrograms otherwise. Defaults to False.\n\n        do_amp_to_db_linear (bool, optional):\n            enable/disable amplitude to dB conversion of linear spectrograms. Defaults to False.\n\n        spec_gain (float, optional):\n            gain applied when converting amplitude to DB. Defaults to 1.0.\n\n        power (float, optional):\n            Exponent for the magnitude spectrogram, e.g., 1 for energy, 2 for power, etc.  
Defaults to None.\n\n        use_htk (bool, optional):\n            Use HTK formula in mel filter instead of Slaney. Defaults to False.\n\n        mel_norm (None, 'slaney', or number, optional):\n            If 'slaney', divide the triangular mel weights by the width of the mel band\n            (area normalization).\n\n            If numeric, use `librosa.util.normalize` to normalize each filter to unit l_p norm.\n            See `librosa.util.normalize` for a full description of supported norm values\n            (including `+-np.inf`).\n\n            Otherwise, leave all the triangles aiming for a peak value of 1.0. Defaults to \"slaney\".\n    \"\"\"\n\n    def __init__(\n        self,\n        n_fft,\n        hop_length,\n        win_length,\n        pad_wav=False,\n        window=\"hann_window\",\n        sample_rate=None,\n        mel_fmin=0,\n        mel_fmax=None,\n        n_mels=80,\n        use_mel=False,\n        do_amp_to_db=False,\n        spec_gain=1.0,\n        power=None,\n        use_htk=False,\n        mel_norm=\"slaney\",\n    ):\n        super().__init__()\n        self.n_fft = n_fft\n        self.hop_length = hop_length\n        self.win_length = win_length\n        self.pad_wav = pad_wav\n        self.sample_rate = sample_rate\n        self.mel_fmin = mel_fmin\n        self.mel_fmax = mel_fmax\n        self.n_mels = n_mels\n        self.use_mel = use_mel\n        self.do_amp_to_db = do_amp_to_db\n        self.spec_gain = spec_gain\n        self.power = power\n        self.use_htk = use_htk\n        self.mel_norm = mel_norm\n        self.window = nn.Parameter(getattr(torch, window)(win_length), requires_grad=False)\n        self.mel_basis = None\n        if use_mel:\n            self._build_mel_basis()\n\n    def __call__(self, x):\n        \"\"\"Compute spectrogram frames by torch based stft.\n\n        Args:\n            x (Tensor): input waveform\n\n        Returns:\n            Tensor: spectrogram frames.\n\n        Shapes:\n            x: [B x T] or
[:math:`[B, 1, T]`]\n        \"\"\"\n        if x.ndim == 2:\n            x = x.unsqueeze(1)\n        if self.pad_wav:\n            padding = int((self.n_fft - self.hop_length) / 2)\n            x = torch.nn.functional.pad(x, (padding, padding), mode=\"reflect\")\n        # B x D x T x 2\n        o = torch.stft(\n            x.squeeze(1),\n            self.n_fft,\n            self.hop_length,\n            self.win_length,\n            self.window,\n            center=True,\n            pad_mode=\"reflect\",  # compatible with audio.py\n            normalized=False,\n            onesided=True,\n            return_complex=False,\n        )\n        M = o[:, :, :, 0]\n        P = o[:, :, :, 1]\n        S = torch.sqrt(torch.clamp(M**2 + P**2, min=1e-8))\n\n        if self.power is not None:\n            S = S**self.power\n\n        if self.use_mel:\n            S = torch.matmul(self.mel_basis.to(x), S)\n        if self.do_amp_to_db:\n            S = self._amp_to_db(S, spec_gain=self.spec_gain)\n        return S\n\n    def _build_mel_basis(self):\n        # Keyword arguments work across librosa versions; newer releases reject positional `sr` and `n_fft`.\n        mel_basis = librosa.filters.mel(\n            sr=self.sample_rate,\n            n_fft=self.n_fft,\n            n_mels=self.n_mels,\n            fmin=self.mel_fmin,\n            fmax=self.mel_fmax,\n            htk=self.use_htk,\n            norm=self.mel_norm,\n        )\n        self.mel_basis = torch.from_numpy(mel_basis).float()\n\n    @staticmethod\n    def _amp_to_db(x, spec_gain=1.0):\n        return torch.log(torch.clamp(x, min=1e-5) * spec_gain)\n\n    @staticmethod\n    def _db_to_amp(x, spec_gain=1.0):\n        return torch.exp(x) / spec_gain\n"
  },
  {
    "path": "TTS/utils/callbacks.py",
    "content": "class TrainerCallback:\n    @staticmethod\n    def on_init_start(trainer) -> None:\n        if hasattr(trainer.model, \"module\"):\n            if hasattr(trainer.model.module, \"on_init_start\"):\n                trainer.model.module.on_init_start(trainer)\n        else:\n            if hasattr(trainer.model, \"on_init_start\"):\n                trainer.model.on_init_start(trainer)\n\n        if hasattr(trainer.criterion, \"on_init_start\"):\n            trainer.criterion.on_init_start(trainer)\n\n        if hasattr(trainer.optimizer, \"on_init_start\"):\n            trainer.optimizer.on_init_start(trainer)\n\n    @staticmethod\n    def on_init_end(trainer) -> None:\n        if hasattr(trainer.model, \"module\"):\n            if hasattr(trainer.model.module, \"on_init_end\"):\n                trainer.model.module.on_init_end(trainer)\n        else:\n            if hasattr(trainer.model, \"on_init_end\"):\n                trainer.model.on_init_end(trainer)\n\n        if hasattr(trainer.criterion, \"on_init_end\"):\n            trainer.criterion.on_init_end(trainer)\n\n        if hasattr(trainer.optimizer, \"on_init_end\"):\n            trainer.optimizer.on_init_end(trainer)\n\n    @staticmethod\n    def on_epoch_start(trainer) -> None:\n        if hasattr(trainer.model, \"module\"):\n            if hasattr(trainer.model.module, \"on_epoch_start\"):\n                trainer.model.module.on_epoch_start(trainer)\n        else:\n            if hasattr(trainer.model, \"on_epoch_start\"):\n                trainer.model.on_epoch_start(trainer)\n\n        if hasattr(trainer.criterion, \"on_epoch_start\"):\n            trainer.criterion.on_epoch_start(trainer)\n\n        if hasattr(trainer.optimizer, \"on_epoch_start\"):\n            trainer.optimizer.on_epoch_start(trainer)\n\n    @staticmethod\n    def on_epoch_end(trainer) -> None:\n        if hasattr(trainer.model, \"module\"):\n            if hasattr(trainer.model.module, \"on_epoch_end\"):\n            
    trainer.model.module.on_epoch_end(trainer)\n        else:\n            if hasattr(trainer.model, \"on_epoch_end\"):\n                trainer.model.on_epoch_end(trainer)\n\n        if hasattr(trainer.criterion, \"on_epoch_end\"):\n            trainer.criterion.on_epoch_end(trainer)\n\n        if hasattr(trainer.optimizer, \"on_epoch_end\"):\n            trainer.optimizer.on_epoch_end(trainer)\n\n    @staticmethod\n    def on_train_step_start(trainer) -> None:\n        if hasattr(trainer.model, \"module\"):\n            if hasattr(trainer.model.module, \"on_train_step_start\"):\n                trainer.model.module.on_train_step_start(trainer)\n        else:\n            if hasattr(trainer.model, \"on_train_step_start\"):\n                trainer.model.on_train_step_start(trainer)\n\n        if hasattr(trainer.criterion, \"on_train_step_start\"):\n            trainer.criterion.on_train_step_start(trainer)\n\n        if hasattr(trainer.optimizer, \"on_train_step_start\"):\n            trainer.optimizer.on_train_step_start(trainer)\n\n    @staticmethod\n    def on_train_step_end(trainer) -> None:\n        if hasattr(trainer.model, \"module\"):\n            if hasattr(trainer.model.module, \"on_train_step_end\"):\n                trainer.model.module.on_train_step_end(trainer)\n        else:\n            if hasattr(trainer.model, \"on_train_step_end\"):\n                trainer.model.on_train_step_end(trainer)\n\n        if hasattr(trainer.criterion, \"on_train_step_end\"):\n            trainer.criterion.on_train_step_end(trainer)\n\n        if hasattr(trainer.optimizer, \"on_train_step_end\"):\n            trainer.optimizer.on_train_step_end(trainer)\n\n    @staticmethod\n    def on_keyboard_interrupt(trainer) -> None:\n        if hasattr(trainer.model, \"module\"):\n            if hasattr(trainer.model.module, \"on_keyboard_interrupt\"):\n                trainer.model.module.on_keyboard_interrupt(trainer)\n        else:\n            if hasattr(trainer.model, 
\"on_keyboard_interrupt\"):\n                trainer.model.on_keyboard_interrupt(trainer)\n\n        if hasattr(trainer.criterion, \"on_keyboard_interrupt\"):\n            trainer.criterion.on_keyboard_interrupt(trainer)\n\n        if hasattr(trainer.optimizer, \"on_keyboard_interrupt\"):\n            trainer.optimizer.on_keyboard_interrupt(trainer)\n"
  },
  {
    "path": "TTS/utils/capacitron_optimizer.py",
    "content": "from typing import Generator\n\nfrom trainer.trainer_utils import get_optimizer\n\n\nclass CapacitronOptimizer:\n    \"\"\"Double optimizer class for the Capacitron model.\"\"\"\n\n    def __init__(self, config: dict, model_params: Generator) -> None:\n        self.primary_params, self.secondary_params = self.split_model_parameters(model_params)\n\n        optimizer_names = list(config.optimizer_params.keys())\n        optimizer_parameters = list(config.optimizer_params.values())\n\n        self.primary_optimizer = get_optimizer(\n            optimizer_names[0],\n            optimizer_parameters[0],\n            config.lr,\n            parameters=self.primary_params,\n        )\n\n        self.secondary_optimizer = get_optimizer(\n            optimizer_names[1],\n            self.extract_optimizer_parameters(optimizer_parameters[1]),\n            optimizer_parameters[1][\"lr\"],\n            parameters=self.secondary_params,\n        )\n\n        self.param_groups = self.primary_optimizer.param_groups\n\n    def first_step(self):\n        self.secondary_optimizer.step()\n        self.secondary_optimizer.zero_grad()\n        self.primary_optimizer.zero_grad()\n\n    def step(self):\n        # Update param groups to display the correct learning rate\n        self.param_groups = self.primary_optimizer.param_groups\n        self.primary_optimizer.step()\n\n    def zero_grad(self, set_to_none=False):\n        self.primary_optimizer.zero_grad(set_to_none)\n        self.secondary_optimizer.zero_grad(set_to_none)\n\n    def load_state_dict(self, state_dict):\n        self.primary_optimizer.load_state_dict(state_dict[0])\n        self.secondary_optimizer.load_state_dict(state_dict[1])\n\n    def state_dict(self):\n        return [self.primary_optimizer.state_dict(), self.secondary_optimizer.state_dict()]\n\n    @staticmethod\n    def split_model_parameters(model_params: Generator) -> list:\n        primary_params = []\n        secondary_params = []\n        
for name, param in model_params:\n            if param.requires_grad:\n                if name == \"capacitron_vae_layer.beta\":\n                    secondary_params.append(param)\n                else:\n                    primary_params.append(param)\n        return [iter(primary_params), iter(secondary_params)]\n\n    @staticmethod\n    def extract_optimizer_parameters(params: dict) -> dict:\n        \"\"\"Extract parameters that are not the learning rate\"\"\"\n        return {k: v for k, v in params.items() if k != \"lr\"}\n"
  },
  {
    "path": "TTS/utils/distribute.py",
    "content": "# edited from https://github.com/fastai/imagenet-fast/blob/master/imagenet_nv/distributed.py\nimport torch\nimport torch.distributed as dist\n\n\ndef reduce_tensor(tensor, num_gpus):\n    rt = tensor.clone()\n    dist.all_reduce(rt, op=dist.reduce_op.SUM)\n    rt /= num_gpus\n    return rt\n\n\ndef init_distributed(rank, num_gpus, group_name, dist_backend, dist_url):\n    assert torch.cuda.is_available(), \"Distributed mode requires CUDA.\"\n\n    # Set cuda device so everything is done on the right GPU.\n    torch.cuda.set_device(rank % torch.cuda.device_count())\n\n    # Initialize distributed communication\n    dist.init_process_group(dist_backend, init_method=dist_url, world_size=num_gpus, rank=rank, group_name=group_name)\n"
  },
  {
    "path": "TTS/utils/download.py",
    "content": "# Adapted from https://github.com/pytorch/audio/\n\nimport hashlib\nimport logging\nimport os\nimport tarfile\nimport urllib\nimport urllib.request\nimport zipfile\nfrom os.path import expanduser\nfrom typing import Any, Iterable, List, Optional\n\nfrom torch.utils.model_zoo import tqdm\n\n\ndef stream_url(\n    url: str, start_byte: Optional[int] = None, block_size: int = 32 * 1024, progress_bar: bool = True\n) -> Iterable:\n    \"\"\"Stream url by chunk\n\n    Args:\n        url (str): Url.\n        start_byte (int or None, optional): Start streaming at that point (Default: ``None``).\n        block_size (int, optional): Size of chunks to stream (Default: ``32 * 1024``).\n        progress_bar (bool, optional): Display a progress bar (Default: ``True``).\n    \"\"\"\n\n    # If we already have the whole file, there is no need to download it again\n    req = urllib.request.Request(url, method=\"HEAD\")\n    with urllib.request.urlopen(req) as response:\n        url_size = int(response.info().get(\"Content-Length\", -1))\n    if url_size == start_byte:\n        return\n\n    req = urllib.request.Request(url)\n    if start_byte:\n        req.headers[\"Range\"] = \"bytes={}-\".format(start_byte)\n\n    with urllib.request.urlopen(req) as upointer, tqdm(\n        unit=\"B\",\n        unit_scale=True,\n        unit_divisor=1024,\n        total=url_size,\n        disable=not progress_bar,\n    ) as pbar:\n        num_bytes = 0\n        while True:\n            chunk = upointer.read(block_size)\n            if not chunk:\n                break\n            yield chunk\n            num_bytes += len(chunk)\n            pbar.update(len(chunk))\n\n\ndef download_url(\n    url: str,\n    download_folder: str,\n    filename: Optional[str] = None,\n    hash_value: Optional[str] = None,\n    hash_type: str = \"sha256\",\n    progress_bar: bool = True,\n    resume: bool = False,\n) -> None:\n    \"\"\"Download file to disk.\n\n    Args:\n        url (str): Url.\n   
     download_folder (str): Folder to download file.\n        filename (str or None, optional): Name of downloaded file. If None, it is inferred from the url\n            (Default: ``None``).\n        hash_value (str or None, optional): Hash for url (Default: ``None``).\n        hash_type (str, optional): Hash type, among \"sha256\" and \"md5\" (Default: ``\"sha256\"``).\n        progress_bar (bool, optional): Display a progress bar (Default: ``True``).\n        resume (bool, optional): Enable resuming download (Default: ``False``).\n    \"\"\"\n\n    req = urllib.request.Request(url, method=\"HEAD\")\n    req_info = urllib.request.urlopen(req).info()  # pylint: disable=consider-using-with\n\n    # Detect filename\n    filename = filename or req_info.get_filename() or os.path.basename(url)\n    filepath = os.path.join(download_folder, filename)\n    if resume and os.path.exists(filepath):\n        mode = \"ab\"\n        local_size: Optional[int] = os.path.getsize(filepath)\n\n    elif not resume and os.path.exists(filepath):\n        raise RuntimeError(\"{} already exists. Delete the file manually and retry.\".format(filepath))\n    else:\n        mode = \"wb\"\n        local_size = None\n\n    if hash_value and local_size == int(req_info.get(\"Content-Length\", -1)):\n        with open(filepath, \"rb\") as file_obj:\n            if validate_file(file_obj, hash_value, hash_type):\n                return\n        raise RuntimeError(\"The hash of {} does not match. Delete the file manually and retry.\".format(filepath))\n\n    with open(filepath, mode) as fpointer:\n        for chunk in stream_url(url, start_byte=local_size, progress_bar=progress_bar):\n            fpointer.write(chunk)\n\n    with open(filepath, \"rb\") as file_obj:\n        if hash_value and not validate_file(file_obj, hash_value, hash_type):\n            raise RuntimeError(\"The hash of {} does not match. 
Delete the file manually and retry.\".format(filepath))\n\n\ndef validate_file(file_obj: Any, hash_value: str, hash_type: str = \"sha256\") -> bool:\n    \"\"\"Validate a given file object with its hash.\n\n    Args:\n        file_obj: File object to read from.\n        hash_value (str): Hash for url.\n        hash_type (str, optional): Hash type, among \"sha256\" and \"md5\" (Default: ``\"sha256\"``).\n\n    Returns:\n        bool: True if it is a valid file, else False.\n    \"\"\"\n\n    if hash_type == \"sha256\":\n        hash_func = hashlib.sha256()\n    elif hash_type == \"md5\":\n        hash_func = hashlib.md5()\n    else:\n        raise ValueError(f\"Unsupported hash type: {hash_type}\")\n\n    while True:\n        # Read by chunk to avoid filling memory\n        chunk = file_obj.read(1024**2)\n        if not chunk:\n            break\n        hash_func.update(chunk)\n\n    return hash_func.hexdigest() == hash_value\n\n\ndef extract_archive(from_path: str, to_path: Optional[str] = None, overwrite: bool = False) -> List[str]:\n    \"\"\"Extract archive.\n    Args:\n        from_path (str): the path of the archive.\n        to_path (str or None, optional): the root path of the extracted files (directory of from_path)\n            (Default: ``None``)\n        overwrite (bool, optional): overwrite existing files (Default: ``False``)\n\n    Returns:\n        list: List of paths to extracted files even if not overwritten.\n    \"\"\"\n\n    if to_path is None:\n        to_path = os.path.dirname(from_path)\n\n    try:\n        with tarfile.open(from_path, \"r\") as tar:\n            logging.info(\"Opened tar file %s.\", from_path)\n            files = []\n            for file_ in tar:  # type: Any\n                file_path = os.path.join(to_path, file_.name)\n                if file_.isfile():\n                    files.append(file_path)\n                    if os.path.exists(file_path):\n                        logging.info(\"%s already extracted.\", file_path)\n                        if not
overwrite:\n                            continue\n                tar.extract(file_, to_path)\n            return files\n    except tarfile.ReadError:\n        pass\n\n    try:\n        with zipfile.ZipFile(from_path, \"r\") as zfile:\n            logging.info(\"Opened zip file %s.\", from_path)\n            files = zfile.namelist()\n            for file_ in files:\n                file_path = os.path.join(to_path, file_)\n                if os.path.exists(file_path):\n                    logging.info(\"%s already extracted.\", file_path)\n                    if not overwrite:\n                        continue\n                zfile.extract(file_, to_path)\n        return files\n    except zipfile.BadZipFile:\n        pass\n\n    raise NotImplementedError(\" > [!] only supports tar.gz, tgz, and zip archives.\")\n\n\ndef download_kaggle_dataset(dataset_path: str, dataset_name: str, output_path: str):\n    \"\"\"Download dataset from kaggle.\n    Args:\n        dataset_path (str):\n        This is the Kaggle link to the dataset. For example, VCTK is 'mfekadu/english-multispeaker-corpus-for-voice-cloning'\n        dataset_name (str): Name of the folder the dataset will be saved in.\n        output_path (str): Path of the location you want the dataset folder to be saved to.\n    \"\"\"\n    data_path = os.path.join(output_path, dataset_name)\n    try:\n        import kaggle  # pylint: disable=import-outside-toplevel\n\n        kaggle.api.authenticate()\n        print(f\"\"\"\\nDownloading {dataset_name}...\"\"\")\n        kaggle.api.dataset_download_files(dataset_path, path=data_path, unzip=True)\n    except OSError:\n        print(\n            f\"\"\"[!] in order to download kaggle datasets, you need to have a kaggle api token stored in your {os.path.join(expanduser('~'), '.kaggle/kaggle.json')}\"\"\"\n        )\n"
  },
  {
    "path": "TTS/utils/downloaders.py",
    "content": "import os\nfrom typing import Optional\n\nfrom TTS.utils.download import download_kaggle_dataset, download_url, extract_archive\n\n\ndef download_ljspeech(path: str):\n    \"\"\"Download and extract LJSpeech dataset\n\n    Args:\n        path (str): path to the directory where the dataset will be stored.\n    \"\"\"\n    os.makedirs(path, exist_ok=True)\n    url = \"https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2\"\n    download_url(url, path)\n    basename = os.path.basename(url)\n    archive = os.path.join(path, basename)\n    print(\" > Extracting archive file...\")\n    extract_archive(archive)\n\n\ndef download_vctk(path: str, use_kaggle: Optional[bool] = False):\n    \"\"\"Download and extract VCTK dataset.\n\n    Args:\n        path (str): path to the directory where the dataset will be stored.\n\n        use_kaggle (bool, optional): Downloads vctk dataset from kaggle. Is generally faster. Defaults to False.\n    \"\"\"\n    if use_kaggle:\n        download_kaggle_dataset(\"mfekadu/english-multispeaker-corpus-for-voice-cloning\", \"VCTK\", path)\n    else:\n        os.makedirs(path, exist_ok=True)\n        url = \"https://datashare.ed.ac.uk/bitstream/handle/10283/3443/VCTK-Corpus-0.92.zip\"\n        download_url(url, path)\n        basename = os.path.basename(url)\n        archive = os.path.join(path, basename)\n        print(\" > Extracting archive file...\")\n        extract_archive(archive)\n\n\ndef download_tweb(path: str):\n    \"\"\"Download and extract Tweb dataset\n\n    Args:\n        path (str): Path to the directory where the dataset will be stored.\n    \"\"\"\n    download_kaggle_dataset(\"bryanpark/the-world-english-bible-speech-dataset\", \"TWEB\", path)\n\n\ndef download_libri_tts(path: str, subset: Optional[str] = \"all\"):\n    \"\"\"Download and extract libri tts dataset.\n\n    Args:\n        path (str): Path to the directory where the dataset will be stored.\n\n        subset (str, optional): Name of the subset 
to download. If you only want to download a certain\n        portion specify it here. Defaults to 'all'.\n    \"\"\"\n\n    subset_dict = {\n        \"libri-tts-clean-100\": \"http://www.openslr.org/resources/60/train-clean-100.tar.gz\",\n        \"libri-tts-clean-360\": \"http://www.openslr.org/resources/60/train-clean-360.tar.gz\",\n        \"libri-tts-other-500\": \"http://www.openslr.org/resources/60/train-other-500.tar.gz\",\n        \"libri-tts-dev-clean\": \"http://www.openslr.org/resources/60/dev-clean.tar.gz\",\n        \"libri-tts-dev-other\": \"http://www.openslr.org/resources/60/dev-other.tar.gz\",\n        \"libri-tts-test-clean\": \"http://www.openslr.org/resources/60/test-clean.tar.gz\",\n        \"libri-tts-test-other\": \"http://www.openslr.org/resources/60/test-other.tar.gz\",\n    }\n\n    os.makedirs(path, exist_ok=True)\n    if subset == \"all\":\n        for sub, val in subset_dict.items():\n            print(f\" > Downloading {sub}...\")\n            download_url(val, path)\n            basename = os.path.basename(val)\n            archive = os.path.join(path, basename)\n            print(\" > Extracting archive file...\")\n            extract_archive(archive)\n        print(\" > All subsets downloaded\")\n    else:\n        url = subset_dict[subset]\n        download_url(url, path)\n        basename = os.path.basename(url)\n        archive = os.path.join(path, basename)\n        print(\" > Extracting archive file...\")\n        extract_archive(archive)\n\n\ndef download_thorsten_de(path: str):\n    \"\"\"Download and extract Thorsten german male voice dataset.\n\n    Args:\n        path (str): Path to the directory where the dataset will be stored.\n    \"\"\"\n    os.makedirs(path, exist_ok=True)\n    url = \"https://www.openslr.org/resources/95/thorsten-de_v02.tgz\"\n    download_url(url, path)\n    basename = os.path.basename(url)\n    archive = os.path.join(path, basename)\n    print(\" > Extracting archive file...\")\n    
extract_archive(archive)\n\n\ndef download_mailabs(path: str, language: str = \"english\"):\n    \"\"\"Download and extract Mailabs dataset.\n\n    Args:\n        path (str): Path to the directory where the dataset will be stored.\n\n        language (str): Language subset to download. Defaults to english.\n    \"\"\"\n    language_dict = {\n        \"english\": \"https://data.solak.de/data/Training/stt_tts/en_US.tgz\",\n        \"german\": \"https://data.solak.de/data/Training/stt_tts/de_DE.tgz\",\n        \"french\": \"https://data.solak.de/data/Training/stt_tts/fr_FR.tgz\",\n        \"italian\": \"https://data.solak.de/data/Training/stt_tts/it_IT.tgz\",\n        \"spanish\": \"https://data.solak.de/data/Training/stt_tts/es_ES.tgz\",\n    }\n    os.makedirs(path, exist_ok=True)\n    url = language_dict[language]\n    download_url(url, path)\n    basename = os.path.basename(url)\n    archive = os.path.join(path, basename)\n    print(\" > Extracting archive file...\")\n    extract_archive(archive)\n"
  },
  {
    "path": "TTS/utils/generic_utils.py",
    "content": "# -*- coding: utf-8 -*-\nimport datetime\nimport importlib\nimport os\nimport re\nimport subprocess\nimport sys\nfrom pathlib import Path\nfrom typing import Dict\n\nimport fsspec\nimport torch\n\n\ndef to_cuda(x: torch.Tensor) -> torch.Tensor:\n    if x is None:\n        return None\n    if torch.is_tensor(x):\n        x = x.contiguous()\n        if torch.cuda.is_available():\n            x = x.cuda(non_blocking=True)\n    return x\n\n\ndef get_cuda():\n    use_cuda = torch.cuda.is_available()\n    device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n    return use_cuda, device\n\n\ndef get_git_branch():\n    try:\n        out = subprocess.check_output([\"git\", \"branch\"]).decode(\"utf8\")\n        current = next(line for line in out.split(\"\\n\") if line.startswith(\"*\"))\n        current.replace(\"* \", \"\")\n    except subprocess.CalledProcessError:\n        current = \"inside_docker\"\n    except FileNotFoundError:\n        current = \"unknown\"\n    return current\n\n\ndef get_commit_hash():\n    \"\"\"https://stackoverflow.com/questions/14989858/get-the-current-git-hash-in-a-python-script\"\"\"\n    # try:\n    #     subprocess.check_output(['git', 'diff-index', '--quiet',\n    #                              'HEAD'])  # Verify client is clean\n    # except:\n    #     raise RuntimeError(\n    #         \" !! 
Commit before training to get the commit hash.\")\n    try:\n        commit = subprocess.check_output([\"git\", \"rev-parse\", \"--short\", \"HEAD\"]).decode().strip()\n    # Not copying .git folder into docker container\n    except (subprocess.CalledProcessError, FileNotFoundError):\n        commit = \"0000000\"\n    return commit\n\n\ndef get_experiment_folder_path(root_path, model_name):\n    \"\"\"Get an experiment folder path with the current date and time\"\"\"\n    date_str = datetime.datetime.now().strftime(\"%B-%d-%Y_%I+%M%p\")\n    commit_hash = get_commit_hash()\n    output_folder = os.path.join(root_path, model_name + \"-\" + date_str + \"-\" + commit_hash)\n    return output_folder\n\n\ndef remove_experiment_folder(experiment_path):\n    \"\"\"Check folder if there is a checkpoint, otherwise remove the folder\"\"\"\n    fs = fsspec.get_mapper(experiment_path).fs\n    checkpoint_files = fs.glob(experiment_path + \"/*.pth\")\n    if not checkpoint_files:\n        if fs.exists(experiment_path):\n            fs.rm(experiment_path, recursive=True)\n            print(\" ! Run is removed from {}\".format(experiment_path))\n    else:\n        print(\" ! 
Run is kept in {}\".format(experiment_path))\n\n\ndef count_parameters(model):\n    r\"\"\"Count number of trainable parameters in a network\"\"\"\n    return sum(p.numel() for p in model.parameters() if p.requires_grad)\n\n\ndef to_camel(text):\n    text = text.capitalize()\n    text = re.sub(r\"(?!^)_([a-zA-Z])\", lambda m: m.group(1).upper(), text)\n    text = text.replace(\"Tts\", \"TTS\")\n    text = text.replace(\"vc\", \"VC\")\n    return text\n\n\ndef find_module(module_path: str, module_name: str) -> object:\n    module_name = module_name.lower()\n    module = importlib.import_module(module_path + \".\" + module_name)\n    class_name = to_camel(module_name)\n    return getattr(module, class_name)\n\n\ndef import_class(module_path: str) -> object:\n    \"\"\"Import a class from a module path.\n\n    Args:\n        module_path (str): The module path of the class.\n\n    Returns:\n        object: The imported class.\n    \"\"\"\n    class_name = module_path.split(\".\")[-1]\n    module_path = \".\".join(module_path.split(\".\")[:-1])\n    module = importlib.import_module(module_path)\n    return getattr(module, class_name)\n\n\ndef get_import_path(obj: object) -> str:\n    \"\"\"Get the import path of a class.\n\n    Args:\n        obj (object): The class object.\n\n    Returns:\n        str: The import path of the class.\n    \"\"\"\n    return \".\".join([type(obj).__module__, type(obj).__name__])\n\n\ndef get_user_data_dir(appname):\n    if sys.platform == \"win32\":\n        import winreg  # pylint: disable=import-outside-toplevel\n\n        key = winreg.OpenKey(\n            winreg.HKEY_CURRENT_USER, r\"Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\Shell Folders\"\n        )\n        dir_, _ = winreg.QueryValueEx(key, \"Local AppData\")\n        ans = Path(dir_).resolve(strict=False)\n    elif sys.platform == \"darwin\":\n        ans = Path(\"~/Library/Application Support/\").expanduser()\n    else:\n        ans = 
Path.home().joinpath(\".local/share\")\n    return ans.joinpath(appname)\n\n\ndef set_init_dict(model_dict, checkpoint_state, c):\n    # Partial initialization: if there is a mismatch with new and old layer, it is skipped.\n    for k, v in checkpoint_state.items():\n        if k not in model_dict:\n            print(\" | > Layer missing in the model definition: {}\".format(k))\n    # 1. filter out unnecessary keys\n    pretrained_dict = {k: v for k, v in checkpoint_state.items() if k in model_dict}\n    # 2. filter out different size layers\n    pretrained_dict = {k: v for k, v in pretrained_dict.items() if v.numel() == model_dict[k].numel()}\n    # 3. skip reinit layers\n    if c.has(\"reinit_layers\") and c.reinit_layers is not None:\n        for reinit_layer_name in c.reinit_layers:\n            pretrained_dict = {k: v for k, v in pretrained_dict.items() if reinit_layer_name not in k}\n    # 4. overwrite entries in the existing state dict\n    model_dict.update(pretrained_dict)\n    print(\" | > {} / {} layers are restored.\".format(len(pretrained_dict), len(model_dict)))\n    return model_dict\n\n\ndef format_aux_input(def_args: Dict, kwargs: Dict) -> Dict:\n    \"\"\"Format kwargs to handle auxiliary inputs to models.\n\n    Args:\n        def_args (Dict): A dictionary of argument names and their default values if not defined in `kwargs`.\n        kwargs (Dict): A `dict` or `kwargs` that includes auxiliary inputs to the model.\n\n    Returns:\n        Dict: arguments with formatted auxiliary inputs.\n    \"\"\"\n    kwargs = kwargs.copy()\n    for name in def_args:\n        if name not in kwargs or kwargs[name] is None:\n            kwargs[name] = def_args[name]\n    return kwargs\n\n\nclass KeepAverage:\n    def __init__(self):\n        self.avg_values = {}\n        self.iters = {}\n\n    def __getitem__(self, key):\n        return self.avg_values[key]\n\n    def items(self):\n        return self.avg_values.items()\n\n    def add_value(self, name, init_val=0, 
init_iter=0):\n        self.avg_values[name] = init_val\n        self.iters[name] = init_iter\n\n    def update_value(self, name, value, weighted_avg=False):\n        if name not in self.avg_values:\n            # add value if not exist before\n            self.add_value(name, init_val=value)\n        else:\n            # else update existing value\n            if weighted_avg:\n                self.avg_values[name] = 0.99 * self.avg_values[name] + 0.01 * value\n                self.iters[name] += 1\n            else:\n                self.avg_values[name] = self.avg_values[name] * self.iters[name] + value\n                self.iters[name] += 1\n                self.avg_values[name] /= self.iters[name]\n\n    def add_values(self, name_dict):\n        for key, value in name_dict.items():\n            self.add_value(key, init_val=value)\n\n    def update_values(self, value_dict):\n        for key, value in value_dict.items():\n            self.update_value(key, value)\n"
  },
  {
    "path": "TTS/utils/io.py",
    "content": "import datetime\nimport json\nimport os\nimport pickle as pickle_tts\nimport shutil\nfrom typing import Any, Callable, Dict, Union\n\nimport fsspec\nimport torch\nfrom coqpit import Coqpit\n\nfrom TTS.utils.generic_utils import get_user_data_dir\n\n\nclass RenamingUnpickler(pickle_tts.Unpickler):\n    \"\"\"Overload default pickler to solve module renaming problem\"\"\"\n\n    def find_class(self, module, name):\n        return super().find_class(module.replace(\"mozilla_voice_tts\", \"TTS\"), name)\n\n\nclass AttrDict(dict):\n    \"\"\"A custom dict which converts dict keys\n    to class attributes\"\"\"\n\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.__dict__ = self\n\n\ndef copy_model_files(config: Coqpit, out_path, new_fields=None):\n    \"\"\"Copy config.json and other model files to training folder and add\n    new fields.\n\n    Args:\n        config (Coqpit): Coqpit config defining the training run.\n        out_path (str): output path to copy the file.\n        new_fields (dict): new fileds to be added or edited\n            in the config file.\n    \"\"\"\n    copy_config_path = os.path.join(out_path, \"config.json\")\n    # add extra information fields\n    if new_fields:\n        config.update(new_fields, allow_new=True)\n    # TODO: Revert to config.save_json() once Coqpit supports arbitrary paths.\n    with fsspec.open(copy_config_path, \"w\", encoding=\"utf8\") as f:\n        json.dump(config.to_dict(), f, indent=4)\n\n    # copy model stats file if available\n    if config.audio.stats_path is not None:\n        copy_stats_path = os.path.join(out_path, \"scale_stats.npy\")\n        filesystem = fsspec.get_mapper(copy_stats_path).fs\n        if not filesystem.exists(copy_stats_path):\n            with fsspec.open(config.audio.stats_path, \"rb\") as source_file:\n                with fsspec.open(copy_stats_path, \"wb\") as target_file:\n                    
shutil.copyfileobj(source_file, target_file)\n\n\ndef load_fsspec(\n    path: str,\n    map_location: Union[str, Callable, torch.device, Dict[Union[str, torch.device], Union[str, torch.device]]] = None,\n    cache: bool = True,\n    **kwargs,\n) -> Any:\n    \"\"\"Like torch.load but can load from other locations (e.g. s3:// , gs://).\n\n    Args:\n        path: Any path or url supported by fsspec.\n        map_location: torch.device or str.\n        cache: If True, cache a remote file locally for subsequent calls. It is cached under `get_user_data_dir()/tts_cache`. Defaults to True.\n        **kwargs: Keyword arguments forwarded to torch.load.\n\n    Returns:\n        Object stored in path.\n    \"\"\"\n    is_local = os.path.isdir(path) or os.path.isfile(path)\n    if cache and not is_local:\n        with fsspec.open(\n            f\"filecache::{path}\",\n            filecache={\"cache_storage\": str(get_user_data_dir(\"tts_cache\"))},\n            mode=\"rb\",\n        ) as f:\n            return torch.load(f, map_location=map_location, **kwargs)\n    else:\n        with fsspec.open(path, \"rb\") as f:\n            return torch.load(f, map_location=map_location, **kwargs)\n\n\ndef load_checkpoint(\n    model, checkpoint_path, use_cuda=False, eval=False, cache=False\n):  # pylint: disable=redefined-builtin\n    try:\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n    except ModuleNotFoundError:\n        pickle_tts.Unpickler = RenamingUnpickler\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), pickle_module=pickle_tts, cache=cache)\n    model.load_state_dict(state[\"model\"])\n    if use_cuda:\n        model.cuda()\n    if eval:\n        model.eval()\n    return model, state\n\n\ndef save_fsspec(state: Any, path: str, **kwargs):\n    \"\"\"Like torch.save but can save to other locations (e.g. 
s3:// , gs://).\n\n    Args:\n        state: State object to save\n        path: Any path or url supported by fsspec.\n        **kwargs: Keyword arguments forwarded to torch.save.\n    \"\"\"\n    with fsspec.open(path, \"wb\") as f:\n        torch.save(state, f, **kwargs)\n\n\ndef save_model(config, model, optimizer, scaler, current_step, epoch, output_path, **kwargs):\n    if hasattr(model, \"module\"):\n        model_state = model.module.state_dict()\n    else:\n        model_state = model.state_dict()\n    if isinstance(optimizer, list):\n        optimizer_state = [optim.state_dict() for optim in optimizer]\n    elif optimizer.__class__.__name__ == \"CapacitronOptimizer\":\n        optimizer_state = [optimizer.primary_optimizer.state_dict(), optimizer.secondary_optimizer.state_dict()]\n    else:\n        optimizer_state = optimizer.state_dict() if optimizer is not None else None\n\n    if isinstance(scaler, list):\n        scaler_state = [s.state_dict() for s in scaler]\n    else:\n        scaler_state = scaler.state_dict() if scaler is not None else None\n\n    if isinstance(config, Coqpit):\n        config = config.to_dict()\n\n    state = {\n        \"config\": config,\n        \"model\": model_state,\n        \"optimizer\": optimizer_state,\n        \"scaler\": scaler_state,\n        \"step\": current_step,\n        \"epoch\": epoch,\n        \"date\": datetime.date.today().strftime(\"%B %d, %Y\"),\n    }\n    state.update(kwargs)\n    save_fsspec(state, output_path)\n\n\ndef save_checkpoint(\n    config,\n    model,\n    optimizer,\n    scaler,\n    current_step,\n    epoch,\n    output_folder,\n    **kwargs,\n):\n    file_name = \"checkpoint_{}.pth\".format(current_step)\n    checkpoint_path = os.path.join(output_folder, file_name)\n    print(\"\\n > CHECKPOINT : {}\".format(checkpoint_path))\n    save_model(\n        config,\n        model,\n        optimizer,\n        scaler,\n        current_step,\n        epoch,\n        checkpoint_path,\n        
**kwargs,\n    )\n\n\ndef save_best_model(\n    current_loss,\n    best_loss,\n    config,\n    model,\n    optimizer,\n    scaler,\n    current_step,\n    epoch,\n    out_path,\n    keep_all_best=False,\n    keep_after=10000,\n    **kwargs,\n):\n    if current_loss < best_loss:\n        best_model_name = f\"best_model_{current_step}.pth\"\n        checkpoint_path = os.path.join(out_path, best_model_name)\n        print(\" > BEST MODEL : {}\".format(checkpoint_path))\n        save_model(\n            config,\n            model,\n            optimizer,\n            scaler,\n            current_step,\n            epoch,\n            checkpoint_path,\n            model_loss=current_loss,\n            **kwargs,\n        )\n        fs = fsspec.get_mapper(out_path).fs\n        # only delete previous if current is saved successfully\n        if not keep_all_best or (current_step < keep_after):\n            model_names = fs.glob(os.path.join(out_path, \"best_model*.pth\"))\n            for model_name in model_names:\n                if os.path.basename(model_name) != best_model_name:\n                    fs.rm(model_name)\n        # create a shortcut which always points to the currently best model\n        shortcut_name = \"best_model.pth\"\n        shortcut_path = os.path.join(out_path, shortcut_name)\n        fs.copy(checkpoint_path, shortcut_path)\n        best_loss = current_loss\n    return best_loss\n"
  },
  {
    "path": "TTS/utils/manage.py",
    "content": "import json\nimport os\nimport zipfile\nfrom pathlib import Path\nfrom shutil import copyfile, rmtree\nfrom typing import Dict, Tuple\n\nimport requests\nfrom tqdm import tqdm\n\nfrom TTS.config import load_config\nfrom TTS.utils.generic_utils import get_user_data_dir\n\nLICENSE_URLS = {\n    \"cc by-nc-nd 4.0\": \"https://creativecommons.org/licenses/by-nc-nd/4.0/\",\n    \"mpl\": \"https://www.mozilla.org/en-US/MPL/2.0/\",\n    \"mpl2\": \"https://www.mozilla.org/en-US/MPL/2.0/\",\n    \"mpl 2.0\": \"https://www.mozilla.org/en-US/MPL/2.0/\",\n    \"mit\": \"https://choosealicense.com/licenses/mit/\",\n    \"apache 2.0\": \"https://choosealicense.com/licenses/apache-2.0/\",\n    \"apache2\": \"https://choosealicense.com/licenses/apache-2.0/\",\n    \"cc-by-sa 4.0\": \"https://creativecommons.org/licenses/by-sa/4.0/\",\n}\n\n\nclass ModelManager(object):\n    \"\"\"Manage TTS models defined in .models.json.\n    It provides an interface to list and download\n    models defined in '.models.json'\n\n    Models are downloaded under the user data directory\n    (see `get_user_data_dir`) unless `output_prefix` is set.\n\n    Args:\n        models_file (str): path to .models.json file. Defaults to None.\n        output_prefix (str): prefix to `tts` to download models. Defaults to None.\n        progress_bar (bool): print a progress bar when downloading a file. Defaults to False.\n        verbose (bool): print info. 
Defaults to True.\n    \"\"\"\n\n    def __init__(self, models_file=None, output_prefix=None, progress_bar=False, verbose=True):\n        super().__init__()\n        self.progress_bar = progress_bar\n        self.verbose = verbose\n        if output_prefix is None:\n            self.output_prefix = get_user_data_dir(\"tts\")\n        else:\n            self.output_prefix = os.path.join(output_prefix, \"tts\")\n        self.models_dict = None\n        if models_file is not None:\n            self.read_models_file(models_file)\n        else:\n            # try the default location\n            path = Path(__file__).parent / \"../.models.json\"\n            self.read_models_file(path)\n\n    def read_models_file(self, file_path):\n        \"\"\"Read .models.json as a dict\n\n        Args:\n            file_path (str): path to .models.json.\n        \"\"\"\n        with open(file_path, \"r\", encoding=\"utf-8\") as json_file:\n            self.models_dict = json.load(json_file)\n\n    def _list_models(self, model_type, model_count=0):\n        if self.verbose:\n            print(\" Name format: type/language/dataset/model\")\n        model_list = []\n        for lang in self.models_dict[model_type]:\n            for dataset in self.models_dict[model_type][lang]:\n                for model in self.models_dict[model_type][lang][dataset]:\n                    model_full_name = f\"{model_type}--{lang}--{dataset}--{model}\"\n                    output_path = os.path.join(self.output_prefix, model_full_name)\n                    if self.verbose:\n                        if os.path.exists(output_path):\n                            print(f\" {model_count}: {model_type}/{lang}/{dataset}/{model} [already downloaded]\")\n                        else:\n                            print(f\" {model_count}: {model_type}/{lang}/{dataset}/{model}\")\n                    model_list.append(f\"{model_type}/{lang}/{dataset}/{model}\")\n                    model_count += 1\n        return 
model_list\n\n    def _list_for_model_type(self, model_type):\n        models_name_list = []\n        model_count = 1\n        models_name_list.extend(self._list_models(model_type, model_count))\n        return models_name_list\n\n    def list_models(self):\n        models_name_list = []\n        model_count = 1\n        for model_type in self.models_dict:\n            model_list = self._list_models(model_type, model_count)\n            models_name_list.extend(model_list)\n        return models_name_list\n\n    def model_info_by_idx(self, model_query):\n        \"\"\"Print the description of the model from .models.json file using model_idx\n\n        Args:\n            model_query (str): <model_type>/<model_idx>\n        \"\"\"\n        model_name_list = []\n        model_type, model_query_idx = model_query.split(\"/\")\n        try:\n            model_query_idx = int(model_query_idx)\n            if model_query_idx <= 0:\n                print(\"> model_query_idx should be a positive integer!\")\n                return\n        except ValueError:\n            print(\"> model_query_idx should be an integer!\")\n            return\n        model_count = 0\n        if model_type in self.models_dict:\n            for lang in self.models_dict[model_type]:\n                for dataset in self.models_dict[model_type][lang]:\n                    for model in self.models_dict[model_type][lang][dataset]:\n                        model_name_list.append(f\"{model_type}/{lang}/{dataset}/{model}\")\n                        model_count += 1\n        else:\n            print(f\"> model_type {model_type} does not exist in the list.\")\n            return\n        if model_query_idx > model_count:\n            print(f\"model query idx exceeds the number of available models [{model_count}] \")\n        else:\n            model_type, lang, dataset, model = model_name_list[model_query_idx - 1].split(\"/\")\n            print(f\"> model type : {model_type}\")\n 
           print(f\"> language supported : {lang}\")\n            print(f\"> dataset used : {dataset}\")\n            print(f\"> model name : {model}\")\n            if \"description\" in self.models_dict[model_type][lang][dataset][model]:\n                print(f\"> description : {self.models_dict[model_type][lang][dataset][model]['description']}\")\n            else:\n                print(\"> description : coming soon\")\n            if \"default_vocoder\" in self.models_dict[model_type][lang][dataset][model]:\n                print(f\"> default_vocoder : {self.models_dict[model_type][lang][dataset][model]['default_vocoder']}\")\n\n    def model_info_by_full_name(self, model_query_name):\n        \"\"\"Print the description of the model from .models.json file using model_full_name\n\n        Args:\n            model_query_name (str): Format is <model_type>/<language>/<dataset>/<model_name>\n        \"\"\"\n        model_type, lang, dataset, model = model_query_name.split(\"/\")\n        if model_type in self.models_dict:\n            if lang in self.models_dict[model_type]:\n                if dataset in self.models_dict[model_type][lang]:\n                    if model in self.models_dict[model_type][lang][dataset]:\n                        print(f\"> model type : {model_type}\")\n                        print(f\"> language supported : {lang}\")\n                        print(f\"> dataset used : {dataset}\")\n                        print(f\"> model name : {model}\")\n                        if \"description\" in self.models_dict[model_type][lang][dataset][model]:\n                            print(\n                                f\"> description : {self.models_dict[model_type][lang][dataset][model]['description']}\"\n                            )\n                        else:\n                            print(\"> description : coming soon\")\n                        if \"default_vocoder\" in self.models_dict[model_type][lang][dataset][model]:\n              
              print(\n                                f\"> default_vocoder : {self.models_dict[model_type][lang][dataset][model]['default_vocoder']}\"\n                            )\n                    else:\n                        print(f\"> model {model} does not exist for {model_type}/{lang}/{dataset}.\")\n                else:\n                    print(f\"> dataset {dataset} does not exist for {model_type}/{lang}.\")\n            else:\n                print(f\"> lang {lang} does not exist for {model_type}.\")\n        else:\n            print(f\"> model_type {model_type} does not exist in the list.\")\n\n    def list_tts_models(self):\n        \"\"\"Print all `TTS` models and return a list of model names\n\n        Format is `language/dataset/model`\n        \"\"\"\n        return self._list_for_model_type(\"tts_models\")\n\n    def list_vocoder_models(self):\n        \"\"\"Print all the `vocoder` models and return a list of model names\n\n        Format is `language/dataset/model`\n        \"\"\"\n        return self._list_for_model_type(\"vocoder_models\")\n\n    def list_vc_models(self):\n        \"\"\"Print all the voice conversion models and return a list of model names\n\n        Format is `language/dataset/model`\n        \"\"\"\n        return self._list_for_model_type(\"voice_conversion_models\")\n\n    def list_langs(self):\n        \"\"\"Print all the available languages\"\"\"\n        print(\" Name format: type/language\")\n        for model_type in self.models_dict:\n            for lang in self.models_dict[model_type]:\n                print(f\" >: {model_type}/{lang} \")\n\n    def list_datasets(self):\n        \"\"\"Print all the datasets\"\"\"\n        print(\" Name format: type/language/dataset\")\n        for model_type in self.models_dict:\n            for lang in self.models_dict[model_type]:\n                for dataset in self.models_dict[model_type][lang]:\n                    print(f\" >: {model_type}/{lang}/{dataset}\")\n\n    
@staticmethod\n    def print_model_license(model_item: Dict):\n        \"\"\"Print the license of a model\n\n        Args:\n            model_item (dict): model item in the models.json\n        \"\"\"\n        if \"license\" in model_item and model_item[\"license\"].strip() != \"\":\n            print(f\" > Model's license - {model_item['license']}\")\n            if model_item[\"license\"].lower() in LICENSE_URLS:\n                print(f\" > Check {LICENSE_URLS[model_item['license'].lower()]} for more info.\")\n            else:\n                print(\" > Check https://opensource.org/licenses for more info.\")\n        else:\n            print(\" > Model's license - No license information available\")\n\n    def download_model(self, model_name):\n        \"\"\"Download model files given the full model name.\n        Model name is in the format\n            'type/language/dataset/model'\n            e.g. 'tts_model/en/ljspeech/tacotron'\n\n        Every model must have the following files:\n            - *.pth : pytorch model checkpoint file.\n            - config.json : model config file.\n            - scale_stats.npy (if exist): scale values for preprocessing.\n\n        Args:\n            model_name (str): model name as explained above.\n        \"\"\"\n        # fetch model info from the dict\n        model_type, lang, dataset, model = model_name.split(\"/\")\n        model_full_name = f\"{model_type}--{lang}--{dataset}--{model}\"\n        model_item = self.models_dict[model_type][lang][dataset][model]\n        model_item[\"model_type\"] = model_type\n        # set the model specific output path\n        output_path = os.path.join(self.output_prefix, model_full_name)\n        if os.path.exists(output_path):\n            print(f\" > {model_name} is already downloaded.\")\n        else:\n            os.makedirs(output_path, exist_ok=True)\n            print(f\" > Downloading model to {output_path}\")\n            # download from github release\n            
self._download_zip_file(model_item[\"github_rls_url\"], output_path, self.progress_bar)\n            self.print_model_license(model_item=model_item)\n        # find downloaded files\n        output_model_path, output_config_path = self._find_files(output_path)\n        # update paths in the config.json\n        self._update_paths(output_path, output_config_path)\n        return output_model_path, output_config_path, model_item\n\n    @staticmethod\n    def _find_files(output_path: str) -> Tuple[str, str]:\n        \"\"\"Find the model and config files in the output path\n\n        Args:\n            output_path (str): path to the model files\n\n        Returns:\n            Tuple[str, str]: path to the model file and config file\n        \"\"\"\n        model_file = None\n        config_file = None\n        for file_name in os.listdir(output_path):\n            if file_name in [\"model_file.pth\", \"model_file.pth.tar\", \"model.pth\"]:\n                model_file = os.path.join(output_path, file_name)\n            elif file_name == \"config.json\":\n                config_file = os.path.join(output_path, file_name)\n        if model_file is None:\n            raise ValueError(\" [!] Model file not found in the output path\")\n        if config_file is None:\n            raise ValueError(\" [!] 
Config file not found in the output path\")\n        return model_file, config_file\n\n    @staticmethod\n    def _find_speaker_encoder(output_path: str) -> str:\n        \"\"\"Find the speaker encoder file in the output path\n\n        Args:\n            output_path (str): path to the model files\n\n        Returns:\n            str: path to the speaker encoder file\n        \"\"\"\n        speaker_encoder_file = None\n        for file_name in os.listdir(output_path):\n            if file_name in [\"model_se.pth\", \"model_se.pth.tar\"]:\n                speaker_encoder_file = os.path.join(output_path, file_name)\n        return speaker_encoder_file\n\n    def _update_paths(self, output_path: str, config_path: str) -> None:\n        \"\"\"Update paths for certain files in config.json after download.\n\n        Args:\n            output_path (str): local path the model is downloaded to.\n            config_path (str): local config.json path.\n        \"\"\"\n        output_stats_path = os.path.join(output_path, \"scale_stats.npy\")\n        output_d_vector_file_path = os.path.join(output_path, \"speakers.json\")\n        output_d_vector_file_pth_path = os.path.join(output_path, \"speakers.pth\")\n        output_speaker_ids_file_path = os.path.join(output_path, \"speaker_ids.json\")\n        output_speaker_ids_file_pth_path = os.path.join(output_path, \"speaker_ids.pth\")\n        speaker_encoder_config_path = os.path.join(output_path, \"config_se.json\")\n        speaker_encoder_model_path = self._find_speaker_encoder(output_path)\n\n        # update the scale_path.npy file path in the model config.json\n        self._update_path(\"audio.stats_path\", output_stats_path, config_path)\n\n        # update the speakers.json file path in the model config.json to the current path\n        self._update_path(\"d_vector_file\", output_d_vector_file_path, config_path)\n        self._update_path(\"d_vector_file\", output_d_vector_file_pth_path, config_path)\n        
self._update_path(\"model_args.d_vector_file\", output_d_vector_file_path, config_path)\n        self._update_path(\"model_args.d_vector_file\", output_d_vector_file_pth_path, config_path)\n\n        # update the speaker_ids.json file path in the model config.json to the current path\n        self._update_path(\"speakers_file\", output_speaker_ids_file_path, config_path)\n        self._update_path(\"speakers_file\", output_speaker_ids_file_pth_path, config_path)\n        self._update_path(\"model_args.speakers_file\", output_speaker_ids_file_path, config_path)\n        self._update_path(\"model_args.speakers_file\", output_speaker_ids_file_pth_path, config_path)\n\n        # update the speaker_encoder file path in the model config.json to the current path\n        self._update_path(\"speaker_encoder_model_path\", speaker_encoder_model_path, config_path)\n        self._update_path(\"model_args.speaker_encoder_model_path\", speaker_encoder_model_path, config_path)\n        self._update_path(\"speaker_encoder_config_path\", speaker_encoder_config_path, config_path)\n        self._update_path(\"model_args.speaker_encoder_config_path\", speaker_encoder_config_path, config_path)\n\n    @staticmethod\n    def _update_path(field_name, new_path, config_path):\n        \"\"\"Update the path in the model config.json for the current environment after download\"\"\"\n        if new_path and os.path.exists(new_path):\n            config = load_config(config_path)\n            field_names = field_name.split(\".\")\n            if len(field_names) > 1:\n                # field name points to a sub-level field\n                sub_conf = config\n                for fd in field_names[:-1]:\n                    if fd in sub_conf:\n                        sub_conf = sub_conf[fd]\n                    else:\n                        return\n                if isinstance(sub_conf[field_names[-1]], list):\n                    sub_conf[field_names[-1]] = [new_path]\n                else:\n  
                  sub_conf[field_names[-1]] = new_path\n            else:\n                # field name points to a top-level field\n                if not field_name in config:\n                    return\n                if isinstance(config[field_name], list):\n                    config[field_name] = [new_path]\n                else:\n                    config[field_name] = new_path\n            config.save_json(config_path)\n\n    @staticmethod\n    def _download_zip_file(file_url, output_folder, progress_bar):\n        \"\"\"Download the github releases\"\"\"\n        # download the file\n        r = requests.get(file_url, stream=True)\n        # extract the file\n        try:\n            total_size_in_bytes = int(r.headers.get(\"content-length\", 0))\n            block_size = 1024  # 1 Kibibyte\n            if progress_bar:\n                progress_bar = tqdm(total=total_size_in_bytes, unit=\"iB\", unit_scale=True)\n            temp_zip_name = os.path.join(output_folder, file_url.split(\"/\")[-1])\n            with open(temp_zip_name, \"wb\") as file:\n                for data in r.iter_content(block_size):\n                    if progress_bar:\n                        progress_bar.update(len(data))\n                    file.write(data)\n            with zipfile.ZipFile(temp_zip_name) as z:\n                z.extractall(output_folder)\n            os.remove(temp_zip_name)  # delete zip after extract\n        except zipfile.BadZipFile:\n            print(f\" > Error: Bad zip file - {file_url}\")\n            raise zipfile.BadZipFile  # pylint: disable=raise-missing-from\n        # move the files to the outer path\n        for file_path in z.namelist()[1:]:\n            src_path = os.path.join(output_folder, file_path)\n            dst_path = os.path.join(output_folder, os.path.basename(file_path))\n            if src_path != dst_path:\n                copyfile(src_path, dst_path)\n        # remove the extracted folder\n        
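rmtree(os.path.join(output_folder, z.namelist()[0]))\n\n    @staticmethod\n    def _check_dict_key(my_dict, key):\n        \"\"\"Return True if `key` is in `my_dict` with a value that is neither None nor empty.\"\"\"\n        if key in my_dict and my_dict[key] is not None:\n            value = my_dict[key]\n            # values without a length (e.g. bool, int) only need to be non-None\n            if not hasattr(value, \"__len__\"):\n                return True\n            # sized values (str, list, dict) must be non-empty\n            if len(value) > 0:\n                return True\n        return False\n"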
  },
  {
    "path": "TTS/utils/radam.py",
    "content": "# modified from https://github.com/LiyuanLucasLiu/RAdam\n\nimport math\n\nimport torch\nfrom torch.optim.optimizer import Optimizer\n\n\nclass RAdam(Optimizer):\n    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0, degenerated_to_sgd=True):\n        if lr < 0.0:\n            raise ValueError(\"Invalid learning rate: {}\".format(lr))\n        if eps < 0.0:\n            raise ValueError(\"Invalid epsilon value: {}\".format(eps))\n        if not 0.0 <= betas[0] < 1.0:\n            raise ValueError(\"Invalid beta parameter at index 0: {}\".format(betas[0]))\n        if not 0.0 <= betas[1] < 1.0:\n            raise ValueError(\"Invalid beta parameter at index 1: {}\".format(betas[1]))\n\n        self.degenerated_to_sgd = degenerated_to_sgd\n        if isinstance(params, (list, tuple)) and len(params) > 0 and isinstance(params[0], dict):\n            for param in params:\n                if \"betas\" in param and (param[\"betas\"][0] != betas[0] or param[\"betas\"][1] != betas[1]):\n                    param[\"buffer\"] = [[None, None, None] for _ in range(10)]\n        defaults = dict(\n            lr=lr, betas=betas, eps=eps, weight_decay=weight_decay, buffer=[[None, None, None] for _ in range(10)]\n        )\n        super().__init__(params, defaults)\n\n    def __setstate__(self, state):  # pylint: disable=useless-super-delegation\n        super().__setstate__(state)\n\n    def step(self, closure=None):\n        loss = None\n        if closure is not None:\n            loss = closure()\n\n        for group in self.param_groups:\n            for p in group[\"params\"]:\n                if p.grad is None:\n                    continue\n                grad = p.grad.data.float()\n                if grad.is_sparse:\n                    raise RuntimeError(\"RAdam does not support sparse gradients\")\n\n                p_data_fp32 = p.data.float()\n\n                state = self.state[p]\n\n                if len(state) == 
0:\n                    state[\"step\"] = 0\n                    state[\"exp_avg\"] = torch.zeros_like(p_data_fp32)\n                    state[\"exp_avg_sq\"] = torch.zeros_like(p_data_fp32)\n                else:\n                    state[\"exp_avg\"] = state[\"exp_avg\"].type_as(p_data_fp32)\n                    state[\"exp_avg_sq\"] = state[\"exp_avg_sq\"].type_as(p_data_fp32)\n\n                exp_avg, exp_avg_sq = state[\"exp_avg\"], state[\"exp_avg_sq\"]\n                beta1, beta2 = group[\"betas\"]\n\n                exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)\n                exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)\n\n                state[\"step\"] += 1\n                buffered = group[\"buffer\"][int(state[\"step\"] % 10)]\n                if state[\"step\"] == buffered[0]:\n                    N_sma, step_size = buffered[1], buffered[2]\n                else:\n                    buffered[0] = state[\"step\"]\n                    beta2_t = beta2 ** state[\"step\"]\n                    N_sma_max = 2 / (1 - beta2) - 1\n                    N_sma = N_sma_max - 2 * state[\"step\"] * beta2_t / (1 - beta2_t)\n                    buffered[1] = N_sma\n\n                    # more conservative since it's an approximated value\n                    if N_sma >= 5:\n                        step_size = math.sqrt(\n                            (1 - beta2_t)\n                            * (N_sma - 4)\n                            / (N_sma_max - 4)\n                            * (N_sma - 2)\n                            / N_sma\n                            * N_sma_max\n                            / (N_sma_max - 2)\n                        ) / (1 - beta1 ** state[\"step\"])\n                    elif self.degenerated_to_sgd:\n                        step_size = 1.0 / (1 - beta1 ** state[\"step\"])\n                    else:\n                        step_size = -1\n                    buffered[2] = step_size\n\n                # more conservative 
since it's an approximated value\n                if N_sma >= 5:\n                    if group[\"weight_decay\"] != 0:\n                        p_data_fp32.add_(p_data_fp32, alpha=-group[\"weight_decay\"] * group[\"lr\"])\n                    denom = exp_avg_sq.sqrt().add_(group[\"eps\"])\n                    p_data_fp32.addcdiv_(exp_avg, denom, value=-step_size * group[\"lr\"])\n                    p.data.copy_(p_data_fp32)\n                elif step_size > 0:\n                    if group[\"weight_decay\"] != 0:\n                        p_data_fp32.add_(p_data_fp32, alpha=-group[\"weight_decay\"] * group[\"lr\"])\n                    p_data_fp32.add_(exp_avg, alpha=-step_size * group[\"lr\"])\n                    p.data.copy_(p_data_fp32)\n\n        return loss\n"
  },
  {
    "path": "TTS/utils/samplers.py",
    "content": "import math\nimport random\nfrom typing import Callable, List, Union\n\nfrom torch.utils.data.sampler import BatchSampler, Sampler, SubsetRandomSampler\n\n\nclass SubsetSampler(Sampler):\n    \"\"\"\n    Samples elements sequentially from a given list of indices.\n\n    Args:\n        indices (list): a sequence of indices\n    \"\"\"\n\n    def __init__(self, indices):\n        super().__init__(indices)\n        self.indices = indices\n\n    def __iter__(self):\n        return (self.indices[i] for i in range(len(self.indices)))\n\n    def __len__(self):\n        return len(self.indices)\n\n\nclass PerfectBatchSampler(Sampler):\n    \"\"\"\n    Samples a mini-batch of indices for a balanced class batching\n\n    Args:\n        dataset_items(list): dataset items to sample from.\n        classes (list): list of classes of dataset_items to sample from.\n        batch_size (int): total number of samples to be sampled in a mini-batch.\n        num_gpus (int): number of GPU in the data parallel mode.\n        shuffle (bool): if True, samples randomly, otherwise samples sequentially.\n        drop_last (bool): if True, drops last incomplete batch.\n    \"\"\"\n\n    def __init__(\n        self,\n        dataset_items,\n        classes,\n        batch_size,\n        num_classes_in_batch,\n        num_gpus=1,\n        shuffle=True,\n        drop_last=False,\n        label_key=\"class_name\",\n    ):\n        super().__init__(dataset_items)\n        assert (\n            batch_size % (num_classes_in_batch * num_gpus) == 0\n        ), \"Batch size must be divisible by number of classes times the number of data parallel devices (if enabled).\"\n\n        label_indices = {}\n        for idx, item in enumerate(dataset_items):\n            label = item[label_key]\n            if label not in label_indices.keys():\n                label_indices[label] = [idx]\n            else:\n                label_indices[label].append(idx)\n\n        if shuffle:\n            
self._samplers = [SubsetRandomSampler(label_indices[key]) for key in classes]\n        else:\n            self._samplers = [SubsetSampler(label_indices[key]) for key in classes]\n\n        self._batch_size = batch_size\n        self._drop_last = drop_last\n        self._dp_devices = num_gpus\n        self._num_classes_in_batch = num_classes_in_batch\n\n    def __iter__(self):\n        batch = []\n        if self._num_classes_in_batch != len(self._samplers):\n            valid_samplers_idx = random.sample(range(len(self._samplers)), self._num_classes_in_batch)\n        else:\n            valid_samplers_idx = None\n\n        iters = [iter(s) for s in self._samplers]\n        done = False\n\n        while True:\n            b = []\n            for i, it in enumerate(iters):\n                if valid_samplers_idx is not None and i not in valid_samplers_idx:\n                    continue\n                idx = next(it, None)\n                if idx is None:\n                    done = True\n                    break\n                b.append(idx)\n            if done:\n                break\n            batch += b\n            if len(batch) == self._batch_size:\n                yield batch\n                batch = []\n                if valid_samplers_idx is not None:\n                    valid_samplers_idx = random.sample(range(len(self._samplers)), self._num_classes_in_batch)\n\n        if not self._drop_last:\n            if len(batch) > 0:\n                groups = len(batch) // self._num_classes_in_batch\n                if groups % self._dp_devices == 0:\n                    yield batch\n                else:\n                    batch = batch[: (groups // self._dp_devices) * self._dp_devices * self._num_classes_in_batch]\n                    if len(batch) > 0:\n                        yield batch\n\n    def __len__(self):\n        class_batch_size = self._batch_size // self._num_classes_in_batch\n        return min(((len(s) + class_batch_size - 1) // 
class_batch_size) for s in self._samplers)\n\n\ndef identity(x):\n    return x\n\n\nclass SortedSampler(Sampler):\n    \"\"\"Samples elements sequentially, always in the same order.\n\n    Taken from https://github.com/PetrochukM/PyTorch-NLP\n\n    Args:\n        data (iterable): Iterable data.\n        sort_key (callable): Specifies a function of one argument that is used to extract a\n            numerical comparison key from each list element.\n\n    Example:\n        >>> list(SortedSampler(range(10), sort_key=lambda i: -i))\n        [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]\n\n    \"\"\"\n\n    def __init__(self, data, sort_key: Callable = identity):\n        super().__init__(data)\n        self.data = data\n        self.sort_key = sort_key\n        zip_ = [(i, self.sort_key(row)) for i, row in enumerate(self.data)]\n        zip_ = sorted(zip_, key=lambda r: r[1])\n        self.sorted_indexes = [item[0] for item in zip_]\n\n    def __iter__(self):\n        return iter(self.sorted_indexes)\n\n    def __len__(self):\n        return len(self.data)\n\n\nclass BucketBatchSampler(BatchSampler):\n    \"\"\"Bucket batch sampler\n\n    Adapted from https://github.com/PetrochukM/PyTorch-NLP\n\n    Args:\n        sampler (torch.data.utils.sampler.Sampler):\n        batch_size (int): Size of mini-batch.\n        drop_last (bool): If `True` the sampler will drop the last batch if its size would be less\n            than `batch_size`.\n        data (list): List of data samples.\n        sort_key (callable, optional): Callable to specify a comparison key for sorting.\n        bucket_size_multiplier (int, optional): Buckets are of size\n            `batch_size * bucket_size_multiplier`.\n\n    Example:\n        >>> sampler = WeightedRandomSampler(weights, len(weights))\n        >>> sampler = BucketBatchSampler(sampler, data=data_items, batch_size=32, drop_last=True)\n    \"\"\"\n\n    def __init__(\n        self,\n        sampler,\n        data,\n        batch_size,\n        drop_last,\n 
       sort_key: Union[Callable, List] = identity,\n        bucket_size_multiplier=100,\n    ):\n        super().__init__(sampler, batch_size, drop_last)\n        self.data = data\n        self.sort_key = sort_key\n        _bucket_size = batch_size * bucket_size_multiplier\n        if hasattr(sampler, \"__len__\"):\n            _bucket_size = min(_bucket_size, len(sampler))\n        self.bucket_sampler = BatchSampler(sampler, _bucket_size, False)\n\n    def __iter__(self):\n        for idxs in self.bucket_sampler:\n            bucket_data = [self.data[idx] for idx in idxs]\n            sorted_sampler = SortedSampler(bucket_data, self.sort_key)\n            for batch_idx in SubsetRandomSampler(list(BatchSampler(sorted_sampler, self.batch_size, self.drop_last))):\n                sorted_idxs = [idxs[i] for i in batch_idx]\n                yield sorted_idxs\n\n    def __len__(self):\n        if self.drop_last:\n            return len(self.sampler) // self.batch_size\n        return math.ceil(len(self.sampler) / self.batch_size)\n"
  },
  {
    "path": "TTS/utils/synthesizer.py",
    "content": "import time\nfrom typing import List\n\nimport numpy as np\nimport pysbd\nimport torch\n\nfrom TTS.config import load_config\nfrom TTS.tts.models import setup_model as setup_tts_model\n\n# pylint: disable=unused-wildcard-import\n# pylint: disable=wildcard-import\nfrom TTS.tts.utils.synthesis import synthesis, transfer_voice, trim_silence\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.utils.audio.numpy_transforms import save_wav\nfrom TTS.vc.models import setup_model as setup_vc_model\nfrom TTS.vocoder.models import setup_model as setup_vocoder_model\nfrom TTS.vocoder.utils.generic_utils import interpolate_vocoder_input\n\n\nclass Synthesizer(object):\n    def __init__(\n        self,\n        tts_checkpoint: str = \"\",\n        tts_config_path: str = \"\",\n        tts_speakers_file: str = \"\",\n        tts_languages_file: str = \"\",\n        vocoder_checkpoint: str = \"\",\n        vocoder_config: str = \"\",\n        encoder_checkpoint: str = \"\",\n        encoder_config: str = \"\",\n        vc_checkpoint: str = \"\",\n        vc_config: str = \"\",\n        use_cuda: bool = False,\n    ) -> None:\n        \"\"\"General 🐸 TTS interface for inference. It takes a tts and a vocoder\n        model and synthesize speech from the provided text.\n\n        The text is divided into a list of sentences using `pysbd` and synthesize\n        speech on each sentence separately.\n\n        If you have certain special characters in your text, you need to handle\n        them before providing the text to Synthesizer.\n\n        TODO: set the segmenter based on the source language\n\n        Args:\n            tts_checkpoint (str, optional): path to the tts model file.\n            tts_config_path (str, optional): path to the tts config file.\n            vocoder_checkpoint (str, optional): path to the vocoder model file. Defaults to None.\n            vocoder_config (str, optional): path to the vocoder config file. 
Defaults to None.\n            encoder_checkpoint (str, optional): path to the speaker encoder model file. Defaults to `\"\"`.\n            encoder_config (str, optional): path to the speaker encoder config file. Defaults to `\"\"`.\n            vc_checkpoint (str, optional): path to the voice conversion model file. Defaults to `\"\"`.\n            vc_config (str, optional): path to the voice conversion config file. Defaults to `\"\"`.\n            use_cuda (bool, optional): enable/disable cuda. Defaults to False.\n        \"\"\"\n        self.tts_checkpoint = tts_checkpoint\n        self.tts_config_path = tts_config_path\n        self.tts_speakers_file = tts_speakers_file\n        self.tts_languages_file = tts_languages_file\n        self.vocoder_checkpoint = vocoder_checkpoint\n        self.vocoder_config = vocoder_config\n        self.encoder_checkpoint = encoder_checkpoint\n        self.encoder_config = encoder_config\n        self.vc_checkpoint = vc_checkpoint\n        self.vc_config = vc_config\n        self.use_cuda = use_cuda\n\n        self.tts_model = None\n        self.vocoder_model = None\n        self.vc_model = None\n        self.speaker_manager = None\n        self.tts_speakers = {}\n        self.language_manager = None\n        self.num_languages = 0\n        self.tts_languages = {}\n        self.d_vector_dim = 0\n        self.seg = self._get_segmenter(\"en\")\n\n        if self.use_cuda:\n            assert torch.cuda.is_available(), \"CUDA is not available on this machine.\"\n\n        if tts_checkpoint:\n            self._load_tts(tts_checkpoint, tts_config_path, use_cuda)\n            self.output_sample_rate = self.tts_config.audio[\"sample_rate\"]\n\n        if vocoder_checkpoint:\n            self._load_vocoder(vocoder_checkpoint, vocoder_config, use_cuda)\n            self.output_sample_rate = self.vocoder_config.audio[\"sample_rate\"]\n\n        if vc_checkpoint:\n            self._load_vc(vc_checkpoint, 
vc_config, use_cuda)\n            self.output_sample_rate = self.vc_config.audio[\"output_sample_rate\"]\n\n    @staticmethod\n    def _get_segmenter(lang: str):\n        \"\"\"Get the sentence segmenter for the given language.\n\n        Args:\n            lang (str): target language code.\n\n        Returns:\n            pysbd.Segmenter: sentence segmenter for `lang`.\n        \"\"\"\n        return pysbd.Segmenter(language=lang, clean=True)\n\n    def _load_vc(self, vc_checkpoint: str, vc_config_path: str, use_cuda: bool) -> None:\n        \"\"\"Load the voice conversion model.\n\n        1. Load the model config.\n        2. Init the model from the config.\n        3. Load the model weights.\n        4. Move the model to the GPU if CUDA is enabled.\n\n        Args:\n            vc_checkpoint (str): path to the model checkpoint.\n            vc_config_path (str): path to the model config file.\n            use_cuda (bool): enable/disable CUDA use.\n        \"\"\"\n        # pylint: disable=global-statement\n        self.vc_config = load_config(vc_config_path)\n        self.vc_model = setup_vc_model(config=self.vc_config)\n        self.vc_model.load_checkpoint(self.vc_config, vc_checkpoint)\n        if use_cuda:\n            self.vc_model.cuda()\n\n    def _load_tts(self, tts_checkpoint: str, tts_config_path: str, use_cuda: bool) -> None:\n        \"\"\"Load the TTS model.\n\n        1. Load the model config.\n        2. Init the model from the config.\n        3. Load the model weights.\n        4. Move the model to the GPU if CUDA is enabled.\n        5. 
Init the speaker manager in the model.\n\n        Args:\n            tts_checkpoint (str): path to the model checkpoint.\n            tts_config_path (str): path to the model config file.\n            use_cuda (bool): enable/disable CUDA use.\n        \"\"\"\n        # pylint: disable=global-statement\n        self.tts_config = load_config(tts_config_path)\n        if self.tts_config[\"use_phonemes\"] and self.tts_config[\"phonemizer\"] is None:\n            raise ValueError(\"Phonemizer is not defined in the TTS config.\")\n\n        self.tts_model = setup_tts_model(config=self.tts_config)\n\n        if not self.encoder_checkpoint:\n            self._set_speaker_encoder_paths_from_tts_config()\n\n        self.tts_model.load_checkpoint(self.tts_config, tts_checkpoint, eval=True)\n        if use_cuda:\n            self.tts_model.cuda()\n\n        if self.encoder_checkpoint and hasattr(self.tts_model, \"speaker_manager\"):\n            self.tts_model.speaker_manager.init_encoder(self.encoder_checkpoint, self.encoder_config, use_cuda)\n\n    def _set_speaker_encoder_paths_from_tts_config(self):\n        \"\"\"Set the encoder paths from the tts model config for models with speaker encoders.\"\"\"\n        if hasattr(self.tts_config, \"model_args\") and hasattr(\n            self.tts_config.model_args, \"speaker_encoder_config_path\"\n        ):\n            self.encoder_checkpoint = self.tts_config.model_args.speaker_encoder_model_path\n            self.encoder_config = self.tts_config.model_args.speaker_encoder_config_path\n\n    def _load_vocoder(self, model_file: str, model_config: str, use_cuda: bool) -> None:\n        \"\"\"Load the vocoder model.\n\n        1. Load the vocoder config.\n        2. Init the AudioProcessor for the vocoder.\n        3. Init the vocoder model from the config.\n        4. 
Move the model to the GPU if CUDA is enabled.\n\n        Args:\n            model_file (str): path to the model checkpoint.\n            model_config (str): path to the model config file.\n            use_cuda (bool): enable/disable CUDA use.\n        \"\"\"\n        self.vocoder_config = load_config(model_config)\n        self.vocoder_ap = AudioProcessor(verbose=False, **self.vocoder_config.audio)\n        self.vocoder_model = setup_vocoder_model(self.vocoder_config)\n        self.vocoder_model.load_checkpoint(self.vocoder_config, model_file, eval=True)\n        if use_cuda:\n            self.vocoder_model.cuda()\n\n    def split_into_sentences(self, text) -> List[str]:\n        \"\"\"Split the given text into sentences.\n\n        Args:\n            text (str): input text in string format.\n\n        Returns:\n            List[str]: list of sentences.\n        \"\"\"\n        return self.seg.segment(text)\n\n    def save_wav(self, wav: List[int], path: str) -> None:\n        \"\"\"Save the waveform as a file.\n\n        Args:\n            wav (List[int]): waveform as a list of values.\n            path (str): output path to save the waveform.\n        \"\"\"\n        wav = np.array(wav)\n        save_wav(wav=wav, path=path, sample_rate=self.output_sample_rate)\n\n    def voice_conversion(self, source_wav: str, target_wav: str) -> List[int]:\n        \"\"\"Run the loaded voice conversion model on `source_wav` with `target_wav` as the target voice.\"\"\"\n        output_wav = self.vc_model.voice_conversion(source_wav, target_wav)\n        return output_wav\n\n    def tts(\n        self,\n        text: str = \"\",\n        speaker_name: str = \"\",\n        language_name: str = \"\",\n        speaker_wav=None,\n        style_wav=None,\n        style_text=None,\n        reference_wav=None,\n        reference_speaker_name=None,\n    ) -> List[int]:\n        \"\"\"🐸 TTS magic. Run all the models and generate speech.\n\n        Args:\n            text (str): input text.\n            speaker_name (str, optional): speaker id for multi-speaker models. 
Defaults to \"\".\n            language_name (str, optional): language id for multi-language models. Defaults to \"\".\n            speaker_wav (Union[str, List[str]], optional): path to the speaker wav for voice cloning. Defaults to None.\n            style_wav ([type], optional): style waveform for GST. Defaults to None.\n            style_text ([type], optional): transcription of style_wav for Capacitron. Defaults to None.\n            reference_wav ([type], optional): reference waveform for voice conversion. Defaults to None.\n            reference_speaker_name ([type], optional): speaker id of reference waveform. Defaults to None.\n        Returns:\n            List[int]: synthesized waveform samples.\n        \"\"\"\n        start_time = time.time()\n        wavs = []\n\n        if not text and not reference_wav:\n            raise ValueError(\n                \"You need to define either `text` (for synthesis) or a `reference_wav` (for voice conversion) to use the Coqui TTS API.\"\n            )\n\n        if text:\n            sens = self.split_into_sentences(text)\n            print(\" > Text split into sentences.\")\n            print(sens)\n\n        # handle multi-speaker\n        speaker_embedding = None\n        speaker_id = None\n        if self.tts_speakers_file or hasattr(self.tts_model.speaker_manager, \"name_to_id\"):\n            # handle Neon models with single speaker.\n            if len(self.tts_model.speaker_manager.name_to_id) == 1:\n                speaker_id = list(self.tts_model.speaker_manager.name_to_id.values())[0]\n\n            elif speaker_name and isinstance(speaker_name, str):\n                if self.tts_config.use_d_vector_file:\n                    # get the average speaker embedding from the saved d_vectors.\n                    speaker_embedding = self.tts_model.speaker_manager.get_mean_embedding(\n                        speaker_name, num_samples=None, randomize=False\n                    )\n                    speaker_embedding = 
np.array(speaker_embedding)[None, :]  # [1 x embedding_dim]\n                else:\n                    # get speaker idx from the speaker name\n                    speaker_id = self.tts_model.speaker_manager.name_to_id[speaker_name]\n\n            elif not speaker_name and not speaker_wav:\n                raise ValueError(\n                    \" [!] Looks like you are using a multi-speaker model. \"\n                    \"You need to define either a `speaker_name` or a `speaker_wav` to use a multi-speaker model.\"\n                )\n            else:\n                speaker_embedding = None\n        else:\n            if speaker_name:\n                raise ValueError(\n                    f\" [!] Missing speakers.json file path for selecting speaker {speaker_name}. \"\n                    \"Define path for speakers.json if it is a multi-speaker model or remove defined speaker idx. \"\n                )\n\n        # handle multi-lingual\n        language_id = None\n        if self.tts_languages_file or (\n            hasattr(self.tts_model, \"language_manager\") and self.tts_model.language_manager is not None\n        ):\n            if len(self.tts_model.language_manager.name_to_id) == 1:\n                language_id = list(self.tts_model.language_manager.name_to_id.values())[0]\n\n            elif language_name and isinstance(language_name, str):\n                language_id = self.tts_model.language_manager.name_to_id[language_name]\n\n            elif not language_name:\n                raise ValueError(\n                    \" [!] Looks like you are using a multi-lingual model. \"\n                    \"You need to define a `language_name` to use a multi-lingual model.\"\n                )\n\n            else:\n                raise ValueError(\n                    f\" [!] 
Missing language_ids.json file path for selecting language {language_name}.\"\n                    \"Define path for language_ids.json if it is a multi-lingual model or remove defined language idx. \"\n                )\n\n        # compute a new d_vector from the given clip.\n        if speaker_wav is not None:\n            speaker_embedding = self.tts_model.speaker_manager.compute_embedding_from_clip(speaker_wav)\n\n        use_gl = self.vocoder_model is None\n\n        if not reference_wav:\n            for sen in sens:\n                # synthesize voice\n                outputs = synthesis(\n                    model=self.tts_model,\n                    text=sen,\n                    CONFIG=self.tts_config,\n                    use_cuda=self.use_cuda,\n                    speaker_id=speaker_id,\n                    style_wav=style_wav,\n                    style_text=style_text,\n                    use_griffin_lim=use_gl,\n                    d_vector=speaker_embedding,\n                    language_id=language_id,\n                )\n                waveform = outputs[\"wav\"]\n                mel_postnet_spec = outputs[\"outputs\"][\"model_outputs\"][0].detach().cpu().numpy()\n                if not use_gl:\n                    # denormalize tts output based on tts audio config\n                    mel_postnet_spec = self.tts_model.ap.denormalize(mel_postnet_spec.T).T\n                    device_type = \"cuda\" if self.use_cuda else \"cpu\"\n                    # renormalize spectrogram based on vocoder config\n                    vocoder_input = self.vocoder_ap.normalize(mel_postnet_spec.T)\n                    # compute scale factor for possible sample rate mismatch\n                    scale_factor = [\n                        1,\n                        self.vocoder_config[\"audio\"][\"sample_rate\"] / self.tts_model.ap.sample_rate,\n                    ]\n                    if scale_factor[1] != 1:\n                        print(\" > interpolating tts 
model output.\")\n                        vocoder_input = interpolate_vocoder_input(scale_factor, vocoder_input)\n                    else:\n                        vocoder_input = torch.tensor(vocoder_input).unsqueeze(0)  # pylint: disable=not-callable\n                    # run vocoder model\n                    # [1, T, C]\n                    waveform = self.vocoder_model.inference(vocoder_input.to(device_type))\n                if self.use_cuda and not use_gl:\n                    waveform = waveform.cpu()\n                if not use_gl:\n                    waveform = waveform.numpy()\n                waveform = waveform.squeeze()\n\n                # trim silence\n                if \"do_trim_silence\" in self.tts_config.audio and self.tts_config.audio[\"do_trim_silence\"]:\n                    waveform = trim_silence(waveform, self.tts_model.ap)\n\n                wavs += list(waveform)\n                wavs += [0] * 10000\n        else:\n            # get the speaker embedding or speaker id for the reference wav file\n            reference_speaker_embedding = None\n            reference_speaker_id = None\n            if self.tts_speakers_file or hasattr(self.tts_model.speaker_manager, \"name_to_id\"):\n                if reference_speaker_name and isinstance(reference_speaker_name, str):\n                    if self.tts_config.use_d_vector_file:\n                        # get the speaker embedding from the saved d_vectors.\n                        reference_speaker_embedding = self.tts_model.speaker_manager.get_embeddings_by_name(\n                            reference_speaker_name\n                        )[0]\n                        reference_speaker_embedding = np.array(reference_speaker_embedding)[\n                            None, :\n                        ]  # [1 x embedding_dim]\n                    else:\n                        # get speaker idx from the speaker name\n                        reference_speaker_id = 
self.tts_model.speaker_manager.name_to_id[reference_speaker_name]\n                else:\n                    reference_speaker_embedding = self.tts_model.speaker_manager.compute_embedding_from_clip(\n                        reference_wav\n                    )\n            outputs = transfer_voice(\n                model=self.tts_model,\n                CONFIG=self.tts_config,\n                use_cuda=self.use_cuda,\n                reference_wav=reference_wav,\n                speaker_id=speaker_id,\n                d_vector=speaker_embedding,\n                use_griffin_lim=use_gl,\n                reference_speaker_id=reference_speaker_id,\n                reference_d_vector=reference_speaker_embedding,\n            )\n            waveform = outputs\n            if not use_gl:\n                mel_postnet_spec = outputs[0].detach().cpu().numpy()\n                # denormalize tts output based on tts audio config\n                mel_postnet_spec = self.tts_model.ap.denormalize(mel_postnet_spec.T).T\n                device_type = \"cuda\" if self.use_cuda else \"cpu\"\n                # renormalize spectrogram based on vocoder config\n                vocoder_input = self.vocoder_ap.normalize(mel_postnet_spec.T)\n                # compute scale factor for possible sample rate mismatch\n                scale_factor = [\n                    1,\n                    self.vocoder_config[\"audio\"][\"sample_rate\"] / self.tts_model.ap.sample_rate,\n                ]\n                if scale_factor[1] != 1:\n                    print(\" > interpolating tts model output.\")\n                    vocoder_input = interpolate_vocoder_input(scale_factor, vocoder_input)\n                else:\n                    vocoder_input = torch.tensor(vocoder_input).unsqueeze(0)  # pylint: disable=not-callable\n                # run vocoder model\n                # [1, T, C]\n                waveform = self.vocoder_model.inference(vocoder_input.to(device_type))\n            if 
self.use_cuda:\n                waveform = waveform.cpu()\n            if not use_gl:\n                waveform = waveform.numpy()\n            wavs = waveform.squeeze()\n\n        # compute stats\n        process_time = time.time() - start_time\n        audio_time = len(wavs) / self.tts_config.audio[\"sample_rate\"]\n        print(f\" > Processing time: {process_time}\")\n        print(f\" > Real-time factor: {process_time / audio_time}\")\n        return wavs\n"
  },
  {
    "path": "TTS/utils/training.py",
    "content": "import numpy as np\nimport torch\n\n\ndef check_update(model, grad_clip, ignore_stopnet=False, amp_opt_params=None):\n    r\"\"\"Check model gradient against unexpected jumps and failures\"\"\"\n    skip_flag = False\n    if ignore_stopnet:\n        if not amp_opt_params:\n            grad_norm = torch.nn.utils.clip_grad_norm_(\n                [param for name, param in model.named_parameters() if \"stopnet\" not in name], grad_clip\n            )\n        else:\n            grad_norm = torch.nn.utils.clip_grad_norm_(amp_opt_params, grad_clip)\n    else:\n        if not amp_opt_params:\n            grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), grad_clip)\n        else:\n            grad_norm = torch.nn.utils.clip_grad_norm_(amp_opt_params, grad_clip)\n\n    # compatibility with different torch versions\n    if isinstance(grad_norm, float):\n        if np.isinf(grad_norm):\n            print(\" | > Gradient is INF !!\")\n            skip_flag = True\n    else:\n        if torch.isinf(grad_norm):\n            print(\" | > Gradient is INF !!\")\n            skip_flag = True\n    return grad_norm, skip_flag\n\n\ndef gradual_training_scheduler(global_step, config):\n    \"\"\"Setup the gradual training schedule wrt number\n    of active GPUs\"\"\"\n    num_gpus = torch.cuda.device_count()\n    if num_gpus == 0:\n        num_gpus = 1\n    new_values = None\n    # we set the scheduling wrt num_gpus\n    for values in config.gradual_training:\n        if global_step * num_gpus >= values[0]:\n            new_values = values\n    return new_values[1], new_values[2]\n"
  },
  {
    "path": "TTS/utils/vad.py",
    "content": "import soundfile as sf\nimport torch\nimport torchaudio\n\n\ndef read_audio(path):\n    wav, sr = torchaudio.load(path)\n\n    if wav.size(0) > 1:\n        wav = wav.mean(dim=0, keepdim=True)\n\n    return wav.squeeze(0), sr\n\n\ndef resample_wav(wav, sr, new_sr):\n    wav = wav.unsqueeze(0)\n    transform = torchaudio.transforms.Resample(orig_freq=sr, new_freq=new_sr)\n    wav = transform(wav)\n    return wav.squeeze(0)\n\n\ndef map_timestamps_to_new_sr(vad_sr, new_sr, timestamps, just_begging_end=False):\n    factor = new_sr / vad_sr\n    new_timestamps = []\n    if just_begging_end and timestamps:\n        # get just the start and end timestamps\n        new_dict = {\"start\": int(timestamps[0][\"start\"] * factor), \"end\": int(timestamps[-1][\"end\"] * factor)}\n        new_timestamps.append(new_dict)\n    else:\n        for ts in timestamps:\n            # map to the new SR\n            new_dict = {\"start\": int(ts[\"start\"] * factor), \"end\": int(ts[\"end\"] * factor)}\n            new_timestamps.append(new_dict)\n\n    return new_timestamps\n\n\ndef get_vad_model_and_utils(use_cuda=False):\n    model, utils = torch.hub.load(repo_or_dir=\"snakers4/silero-vad\", model=\"silero_vad\", force_reload=True, onnx=False)\n    if use_cuda:\n        model = model.cuda()\n\n    get_speech_timestamps, save_audio, _, _, collect_chunks = utils\n    return model, get_speech_timestamps, save_audio, collect_chunks\n\n\ndef remove_silence(\n    model_and_utils, audio_path, out_path, vad_sample_rate=8000, trim_just_beginning_and_end=True, use_cuda=False\n):\n    # get the VAD model and utils functions\n    model, get_speech_timestamps, _, collect_chunks = model_and_utils\n\n    # read ground truth wav and resample the audio for the VAD\n    wav, gt_sample_rate = read_audio(audio_path)\n\n    # if needed, resample the audio for the VAD model\n    if gt_sample_rate != vad_sample_rate:\n        wav_vad = resample_wav(wav, gt_sample_rate, vad_sample_rate)\n    
else:\n        wav_vad = wav\n\n    if use_cuda:\n        wav_vad = wav_vad.cuda()\n\n    # get speech timestamps from full audio file\n    speech_timestamps = get_speech_timestamps(wav_vad, model, sampling_rate=vad_sample_rate, window_size_samples=768)\n\n    # map the current speech_timestamps to the sample rate of the ground truth audio\n    new_speech_timestamps = map_timestamps_to_new_sr(\n        vad_sample_rate, gt_sample_rate, speech_timestamps, trim_just_beginning_and_end\n    )\n\n    # if speech was detected, keep only the speech chunks; otherwise save the wav unchanged\n    if new_speech_timestamps:\n        wav = collect_chunks(new_speech_timestamps, wav)\n        is_speech = True\n    else:\n        print(f\"> The file {audio_path} probably does not contain speech, please check it !!\")\n        is_speech = False\n\n    # save audio\n    sf.write(out_path, wav, gt_sample_rate, subtype=\"PCM_16\")\n    return out_path, is_speech\n"
  },
  {
    "path": "TTS/vc/configs/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/vc/configs/freevc_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import List\n\nfrom TTS.vc.configs.shared_configs import BaseVCConfig\nfrom TTS.vc.models.freevc import FreeVCArgs, FreeVCAudioConfig, FreeVCConfig\n"
  },
  {
    "path": "TTS/vc/configs/shared_configs.py",
    "content": "from dataclasses import asdict, dataclass, field\nfrom typing import Dict, List\n\nfrom coqpit import Coqpit, check_argument\n\nfrom TTS.config import BaseAudioConfig, BaseDatasetConfig, BaseTrainingConfig\n\n\n@dataclass\nclass BaseVCConfig(BaseTrainingConfig):\n    \"\"\"Shared parameters among all the vc models.\n\n    Args:\n\n        audio (BaseAudioConfig):\n            Audio processor config object instance.\n\n        batch_group_size (int):\n            Size of the batch groups used for bucketing. By default, the dataloader orders samples by the sequence\n            length for a more efficient and stable training. If `batch_group_size > 1` then it performs bucketing to\n            prevent using the same batches for each epoch.\n\n        loss_masking (bool):\n            Enable / disable masking loss values against padded segments of samples in a batch.\n\n        min_text_len (int):\n            Minimum length of input text to be used. All shorter samples will be ignored. Defaults to 1.\n\n        max_text_len (int):\n            Maximum length of input text to be used. All longer samples will be ignored. Defaults to float(\"inf\").\n\n        min_audio_len (int):\n            Minimum length of input audio to be used. All shorter samples will be ignored. Defaults to 1.\n\n        max_audio_len (int):\n            Maximum length of input audio to be used. All longer samples will be ignored. The maximum length in the\n            dataset defines the VRAM used in the training. Hence, pay attention to this value if you encounter an\n            OOM error in training. 
Defaults to float(\"inf\").\n\n        compute_f0 (bool):\n            (Not in use yet).\n\n        compute_energy (bool):\n            (Not in use yet).\n\n        compute_linear_spec (bool):\n            If True, the data loader computes and returns linear spectrograms alongside the other data.\n\n        precompute_num_workers (int):\n            Number of workers to precompute features. Defaults to 0.\n\n        use_noise_augment (bool):\n            Augment the input audio with random noise.\n\n        start_by_longest (bool):\n            If True, the data loader will start loading the longest batch first. It is useful for checking OOM issues.\n            Defaults to False.\n\n        shuffle (bool):\n            If True, the data loader will shuffle the dataset when there is no sampler defined. Defaults to False.\n\n        drop_last (bool):\n            If True, the data loader will drop the last batch if it is not complete. It helps to prevent\n            issues that emerge from the partial batch statistics. Defaults to False.\n\n        add_blank (bool):\n            Add a blank character between every two characters. It improves performance for some models at the expense\n            of slower run-time due to the longer input sequence.\n\n        datasets (List[BaseDatasetConfig]):\n            List of datasets used for training. If multiple datasets are provided, they are merged and used together\n            for training.\n\n        optimizer (str):\n            Optimizer used for the training. Set one from `torch.optim.Optimizer` or `TTS.utils.training`.\n            Defaults to `radam`.\n\n        optimizer_params (dict):\n            Optimizer kwargs, e.g. `{\"betas\": [0.8, 0.99], \"weight_decay\": 0.0}`. Defaults to None.\n\n        lr_scheduler (str):\n            Learning rate scheduler for the training. Use one from `torch.optim.Scheduler` schedulers or\n            `TTS.utils.training`. 
Defaults to None.\n\n        lr_scheduler_params (dict):\n            Parameters for the generator learning rate scheduler, e.g. `{\"warmup\": 4000}`. Defaults to `{}`.\n\n        test_sentences (List[str]):\n            List of sentences to be used at testing. Defaults to '[]'\n\n        eval_split_max_size (int):\n            Maximum number of samples to be used for evaluation in proportion split. Defaults to None (Disabled).\n\n        eval_split_size (float):\n            If between 0.0 and 1.0, represents the proportion of the dataset to include in the evaluation set.\n            If > 1, represents the absolute number of evaluation samples. Defaults to 0.01 (1%).\n\n        use_speaker_weighted_sampler (bool):\n            Enable / Disable the batch balancer by speaker. Defaults to ```False```.\n\n        speaker_weighted_sampler_alpha (float):\n            Number that controls the influence of the speaker sampler weights. Defaults to ```1.0```.\n\n        use_language_weighted_sampler (bool):\n            Enable / Disable the batch balancer by language. Defaults to ```False```.\n\n        language_weighted_sampler_alpha (float):\n            Number that controls the influence of the language sampler weights. Defaults to ```1.0```.\n\n        use_length_weighted_sampler (bool):\n            Enable / Disable the batch balancer by audio length. If enabled, the dataset will be divided\n            into 10 buckets considering the min and max audio of the dataset. The sampler weights will be\n            computed so that each bucket contributes the same quantity of data to each training batch. Defaults to ```False```.\n\n        length_weighted_sampler_alpha (float):\n            Number that controls the influence of the length sampler weights. 
Defaults to ```1.0```.\n    \"\"\"\n\n    audio: BaseAudioConfig = field(default_factory=BaseAudioConfig)\n    # training params\n    batch_group_size: int = 0\n    loss_masking: bool = None\n    # dataloading\n    min_audio_len: int = 1\n    max_audio_len: int = float(\"inf\")\n    min_text_len: int = 1\n    max_text_len: int = float(\"inf\")\n    compute_f0: bool = False\n    compute_energy: bool = False\n    compute_linear_spec: bool = False\n    precompute_num_workers: int = 0\n    use_noise_augment: bool = False\n    start_by_longest: bool = False\n    shuffle: bool = False\n    drop_last: bool = False\n    # dataset\n    datasets: List[BaseDatasetConfig] = field(default_factory=lambda: [BaseDatasetConfig()])\n    # optimizer\n    optimizer: str = \"radam\"\n    optimizer_params: dict = None\n    # scheduler\n    lr_scheduler: str = None\n    lr_scheduler_params: dict = field(default_factory=lambda: {})\n    # testing\n    test_sentences: List[str] = field(default_factory=lambda: [])\n    # evaluation\n    eval_split_max_size: int = None\n    eval_split_size: float = 0.01\n    # weighted samplers\n    use_speaker_weighted_sampler: bool = False\n    speaker_weighted_sampler_alpha: float = 1.0\n    use_language_weighted_sampler: bool = False\n    language_weighted_sampler_alpha: float = 1.0\n    use_length_weighted_sampler: bool = False\n    length_weighted_sampler_alpha: float = 1.0\n"
  },
  {
    "path": "TTS/vc/models/__init__.py",
    "content": "import importlib\nimport re\nfrom typing import Dict, List, Union\n\n\ndef to_camel(text):\n    text = text.capitalize()\n    return re.sub(r\"(?!^)_([a-zA-Z])\", lambda m: m.group(1).upper(), text)\n\n\ndef setup_model(config: \"Coqpit\", samples: Union[List[List], List[Dict]] = None) -> \"BaseVC\":\n    print(\" > Using model: {}\".format(config.model))\n    # fetch the right model implementation.\n    if \"model\" in config and config[\"model\"].lower() == \"freevc\":\n        MyModel = importlib.import_module(\"TTS.vc.models.freevc\").FreeVC\n        model = MyModel.init_from_config(config, samples)\n    else:\n        # fail loudly instead of hitting an UnboundLocalError on `return model`\n        raise ValueError(f\" [!] Unknown vc model: {config.model}\")\n    return model\n"
  },
  {
    "path": "TTS/vc/models/base_vc.py",
    "content": "import os\nimport random\nfrom typing import Dict, List, Tuple, Union\n\nimport torch\nimport torch.distributed as dist\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.utils.data import DataLoader\nfrom torch.utils.data.sampler import WeightedRandomSampler\nfrom trainer.torch import DistributedSampler, DistributedSamplerWrapper\n\nfrom TTS.model import BaseTrainerModel\nfrom TTS.tts.datasets.dataset import TTSDataset\nfrom TTS.tts.utils.data import get_length_balancer_weights\nfrom TTS.tts.utils.languages import LanguageManager, get_language_balancer_weights\nfrom TTS.tts.utils.speakers import SpeakerManager, get_speaker_balancer_weights\nfrom TTS.tts.utils.synthesis import synthesis\nfrom TTS.tts.utils.visual import plot_alignment, plot_spectrogram\n\n# pylint: skip-file\n\n\nclass BaseVC(BaseTrainerModel):\n    \"\"\"Base `vc` class. Every new `vc` model must inherit this.\n\n    It defines common `vc` specific functions on top of `Model` implementation.\n    \"\"\"\n\n    MODEL_TYPE = \"vc\"\n\n    def __init__(\n        self,\n        config: Coqpit,\n        ap: \"AudioProcessor\",\n        speaker_manager: SpeakerManager = None,\n        language_manager: LanguageManager = None,\n    ):\n        super().__init__()\n        self.config = config\n        self.ap = ap\n        self.speaker_manager = speaker_manager\n        self.language_manager = language_manager\n        self._set_model_args(config)\n\n    def _set_model_args(self, config: Coqpit):\n        \"\"\"Setup model args based on the config type (`ModelConfig` or `ModelArgs`).\n\n        `ModelArgs` has all the fields required to initialize the model architecture.\n\n        `ModelConfig` has all the fields required for training, inference and contains `ModelArgs`.\n\n        If the config is for training with a name like \"*Config\", then the model args are embedded in the\n        config.model_args\n\n        If the config is for the model with a name like \"*Args\", 
then we assign them directly.\n        \"\"\"\n        # don't use isinstance, to avoid recursive imports\n        if \"Config\" in config.__class__.__name__:\n            self.config = config\n            self.args = config.model_args\n        elif \"Args\" in config.__class__.__name__:\n            self.args = config\n        else:\n            raise ValueError(\"config must be either a *Config or *Args\")\n\n    def init_multispeaker(self, config: Coqpit, data: List = None):\n        \"\"\"Initialize a speaker embedding layer if needed and define expected embedding channel size for defining\n        `in_channels` size of the connected layers.\n\n        This implementation yields 3 possible outcomes:\n\n        1. If `config.use_speaker_embedding` and `config.use_d_vector_file` are False, do nothing.\n        2. If `config.use_d_vector_file` is True, set expected embedding channel size to `config.d_vector_dim` or 512.\n        3. If `config.use_speaker_embedding`, initialize a speaker embedding layer with channel size of\n        `config.d_vector_dim` or 512.\n\n        You can override this function for new models.\n\n        Args:\n            config (Coqpit): Model configuration.\n        \"\"\"\n        # set number of speakers\n        if self.speaker_manager is not None:\n            self.num_speakers = self.speaker_manager.num_speakers\n        elif hasattr(config, \"num_speakers\"):\n            self.num_speakers = config.num_speakers\n\n        # set ultimate speaker embedding size\n        if config.use_speaker_embedding or config.use_d_vector_file:\n            self.embedded_speaker_dim = (\n                config.d_vector_dim if \"d_vector_dim\" in config and config.d_vector_dim is not None else 512\n            )\n        # init speaker embedding layer\n        if config.use_speaker_embedding and not config.use_d_vector_file:\n            print(\" > Init speaker_embedding layer.\")\n            self.speaker_embedding = nn.Embedding(self.num_speakers, 
self.embedded_speaker_dim)\n            self.speaker_embedding.weight.data.normal_(0, 0.3)\n\n    def get_aux_input(self, **kwargs) -> Dict:\n        \"\"\"Prepare and return `aux_input` used by `forward()`\"\"\"\n        return {\"speaker_id\": None, \"style_wav\": None, \"d_vector\": None, \"language_id\": None}\n\n    def get_aux_input_from_test_sentences(self, sentence_info):\n        if hasattr(self.config, \"model_args\"):\n            config = self.config.model_args\n        else:\n            config = self.config\n\n        # extract speaker and language info\n        text, speaker_name, style_wav, language_name = None, None, None, None\n\n        if isinstance(sentence_info, list):\n            if len(sentence_info) == 1:\n                text = sentence_info[0]\n            elif len(sentence_info) == 2:\n                text, speaker_name = sentence_info\n            elif len(sentence_info) == 3:\n                text, speaker_name, style_wav = sentence_info\n            elif len(sentence_info) == 4:\n                text, speaker_name, style_wav, language_name = sentence_info\n        else:\n            text = sentence_info\n\n        # get speaker  id/d_vector\n        speaker_id, d_vector, language_id = None, None, None\n        if self.speaker_manager is not None:\n            if config.use_d_vector_file:\n                if speaker_name is None:\n                    d_vector = self.speaker_manager.get_random_embedding()\n                else:\n                    d_vector = self.speaker_manager.get_d_vector_by_name(speaker_name)\n            elif config.use_speaker_embedding:\n                if speaker_name is None:\n                    speaker_id = self.speaker_manager.get_random_id()\n                else:\n                    speaker_id = self.speaker_manager.name_to_id[speaker_name]\n\n        # get language id\n        if self.language_manager is not None and config.use_language_embedding and language_name is not None:\n            language_id 
= self.language_manager.name_to_id[language_name]\n\n        return {\n            \"text\": text,\n            \"speaker_id\": speaker_id,\n            \"style_wav\": style_wav,\n            \"d_vector\": d_vector,\n            \"language_id\": language_id,\n        }\n\n    def format_batch(self, batch: Dict) -> Dict:\n        \"\"\"Generic batch formatting for `VCDataset`.\n\n        You must override this if you use a custom dataset.\n\n        Args:\n            batch (Dict): [description]\n\n        Returns:\n            Dict: [description]\n        \"\"\"\n        # setup input batch\n        text_input = batch[\"token_id\"]\n        text_lengths = batch[\"token_id_lengths\"]\n        speaker_names = batch[\"speaker_names\"]\n        linear_input = batch[\"linear\"]\n        mel_input = batch[\"mel\"]\n        mel_lengths = batch[\"mel_lengths\"]\n        stop_targets = batch[\"stop_targets\"]\n        item_idx = batch[\"item_idxs\"]\n        d_vectors = batch[\"d_vectors\"]\n        speaker_ids = batch[\"speaker_ids\"]\n        attn_mask = batch[\"attns\"]\n        waveform = batch[\"waveform\"]\n        pitch = batch[\"pitch\"]\n        energy = batch[\"energy\"]\n        language_ids = batch[\"language_ids\"]\n        max_text_length = torch.max(text_lengths.float())\n        max_spec_length = torch.max(mel_lengths.float())\n\n        # compute durations from attention masks\n        durations = None\n        if attn_mask is not None:\n            durations = torch.zeros(attn_mask.shape[0], attn_mask.shape[2])\n            for idx, am in enumerate(attn_mask):\n                # compute raw durations\n                c_idxs = am[:, : text_lengths[idx], : mel_lengths[idx]].max(1)[1]\n                # c_idxs, counts = torch.unique_consecutive(c_idxs, return_counts=True)\n                c_idxs, counts = torch.unique(c_idxs, return_counts=True)\n                dur = torch.ones([text_lengths[idx]]).to(counts.dtype)\n                dur[c_idxs] = counts\n     
           # smooth the durations and set any 0 duration to 1\n                # by cutting off from the largest duration indeces.\n                extra_frames = dur.sum() - mel_lengths[idx]\n                largest_idxs = torch.argsort(-dur)[:extra_frames]\n                dur[largest_idxs] -= 1\n                assert (\n                    dur.sum() == mel_lengths[idx]\n                ), f\" [!] total duration {dur.sum()} vs spectrogram length {mel_lengths[idx]}\"\n                durations[idx, : text_lengths[idx]] = dur\n\n        # set stop targets wrt reduction factor\n        stop_targets = stop_targets.view(text_input.shape[0], stop_targets.size(1) // self.config.r, -1)\n        stop_targets = (stop_targets.sum(2) > 0.0).unsqueeze(2).float().squeeze(2)\n        stop_target_lengths = torch.divide(mel_lengths, self.config.r).ceil_()\n\n        return {\n            \"text_input\": text_input,\n            \"text_lengths\": text_lengths,\n            \"speaker_names\": speaker_names,\n            \"mel_input\": mel_input,\n            \"mel_lengths\": mel_lengths,\n            \"linear_input\": linear_input,\n            \"stop_targets\": stop_targets,\n            \"stop_target_lengths\": stop_target_lengths,\n            \"attn_mask\": attn_mask,\n            \"durations\": durations,\n            \"speaker_ids\": speaker_ids,\n            \"d_vectors\": d_vectors,\n            \"max_text_length\": float(max_text_length),\n            \"max_spec_length\": float(max_spec_length),\n            \"item_idx\": item_idx,\n            \"waveform\": waveform,\n            \"pitch\": pitch,\n            \"energy\": energy,\n            \"language_ids\": language_ids,\n            \"audio_unique_names\": batch[\"audio_unique_names\"],\n        }\n\n    def get_sampler(self, config: Coqpit, dataset: TTSDataset, num_gpus=1):\n        weights = None\n        data_items = dataset.samples\n\n        if getattr(config, \"use_language_weighted_sampler\", False):\n         
   alpha = getattr(config, \"language_weighted_sampler_alpha\", 1.0)\n            print(\" > Using Language weighted sampler with alpha:\", alpha)\n            weights = get_language_balancer_weights(data_items) * alpha\n\n        if getattr(config, \"use_speaker_weighted_sampler\", False):\n            alpha = getattr(config, \"speaker_weighted_sampler_alpha\", 1.0)\n            print(\" > Using Speaker weighted sampler with alpha:\", alpha)\n            if weights is not None:\n                weights += get_speaker_balancer_weights(data_items) * alpha\n            else:\n                weights = get_speaker_balancer_weights(data_items) * alpha\n\n        if getattr(config, \"use_length_weighted_sampler\", False):\n            alpha = getattr(config, \"length_weighted_sampler_alpha\", 1.0)\n            print(\" > Using Length weighted sampler with alpha:\", alpha)\n            if weights is not None:\n                weights += get_length_balancer_weights(data_items) * alpha\n            else:\n                weights = get_length_balancer_weights(data_items) * alpha\n\n        if weights is not None:\n            sampler = WeightedRandomSampler(weights, len(weights))\n        else:\n            sampler = None\n\n        # sampler for DDP\n        if sampler is None:\n            sampler = DistributedSampler(dataset) if num_gpus > 1 else None\n        else:  # If a sampler is already defined use this sampler and DDP sampler together\n            sampler = DistributedSamplerWrapper(sampler) if num_gpus > 1 else sampler\n\n        return sampler\n\n    def get_data_loader(\n        self,\n        config: Coqpit,\n        assets: Dict,\n        is_eval: bool,\n        samples: Union[List[Dict], List[List]],\n        verbose: bool,\n        num_gpus: int,\n        rank: int = None,\n    ) -> \"DataLoader\":\n        if is_eval and not config.run_eval:\n            loader = None\n        else:\n            # setup multi-speaker attributes\n            if 
self.speaker_manager is not None:\n                if hasattr(config, \"model_args\"):\n                    speaker_id_mapping = (\n                        self.speaker_manager.name_to_id if config.model_args.use_speaker_embedding else None\n                    )\n                    d_vector_mapping = self.speaker_manager.embeddings if config.model_args.use_d_vector_file else None\n                    config.use_d_vector_file = config.model_args.use_d_vector_file\n                else:\n                    speaker_id_mapping = self.speaker_manager.name_to_id if config.use_speaker_embedding else None\n                    d_vector_mapping = self.speaker_manager.embeddings if config.use_d_vector_file else None\n            else:\n                speaker_id_mapping = None\n                d_vector_mapping = None\n\n            # setup multi-lingual attributes\n            if self.language_manager is not None:\n                language_id_mapping = self.language_manager.name_to_id if self.args.use_language_embedding else None\n            else:\n                language_id_mapping = None\n\n            # init dataloader\n            dataset = TTSDataset(\n                outputs_per_step=config.r if \"r\" in config else 1,\n                compute_linear_spec=config.model.lower() == \"tacotron\" or config.compute_linear_spec,\n                compute_f0=config.get(\"compute_f0\", False),\n                f0_cache_path=config.get(\"f0_cache_path\", None),\n                compute_energy=config.get(\"compute_energy\", False),\n                energy_cache_path=config.get(\"energy_cache_path\", None),\n                samples=samples,\n                ap=self.ap,\n                return_wav=config.return_wav if \"return_wav\" in config else False,\n                batch_group_size=0 if is_eval else config.batch_group_size * config.batch_size,\n                min_text_len=config.min_text_len,\n                max_text_len=config.max_text_len,\n                
min_audio_len=config.min_audio_len,\n                max_audio_len=config.max_audio_len,\n                phoneme_cache_path=config.phoneme_cache_path,\n                precompute_num_workers=config.precompute_num_workers,\n                use_noise_augment=False if is_eval else config.use_noise_augment,\n                verbose=verbose,\n                speaker_id_mapping=speaker_id_mapping,\n                d_vector_mapping=d_vector_mapping if config.use_d_vector_file else None,\n                tokenizer=None,\n                start_by_longest=config.start_by_longest,\n                language_id_mapping=language_id_mapping,\n            )\n\n            # wait all the DDP process to be ready\n            if num_gpus > 1:\n                dist.barrier()\n\n            # sort input sequences from short to long\n            dataset.preprocess_samples()\n\n            # get samplers\n            sampler = self.get_sampler(config, dataset, num_gpus)\n\n            loader = DataLoader(\n                dataset,\n                batch_size=config.eval_batch_size if is_eval else config.batch_size,\n                shuffle=config.shuffle if sampler is None else False,  # if there is no other sampler\n                collate_fn=dataset.collate_fn,\n                drop_last=config.drop_last,  # setting this False might cause issues in AMP training.\n                sampler=sampler,\n                num_workers=config.num_eval_loader_workers if is_eval else config.num_loader_workers,\n                pin_memory=False,\n            )\n        return loader\n\n    def _get_test_aux_input(\n        self,\n    ) -> Dict:\n        d_vector = None\n        if self.config.use_d_vector_file:\n            d_vector = [self.speaker_manager.embeddings[name][\"embedding\"] for name in self.speaker_manager.embeddings]\n            d_vector = (random.sample(sorted(d_vector), 1),)\n\n        aux_inputs = {\n            \"speaker_id\": None\n            if not 
self.config.use_speaker_embedding\n            else random.sample(sorted(self.speaker_manager.name_to_id.values()), 1),\n            \"d_vector\": d_vector,\n            \"style_wav\": None,  # TODO: handle GST style input\n        }\n        return aux_inputs\n\n    def test_run(self, assets: Dict) -> Tuple[Dict, Dict]:\n        \"\"\"Generic test run for `vc` models used by `Trainer`.\n\n        You can override this for a different behaviour.\n\n        Args:\n            assets (dict): A dict of training assets. For `vc` models, it must include `{'audio_processor': ap}`.\n\n        Returns:\n            Tuple[Dict, Dict]: Test figures and audios to be projected to Tensorboard.\n        \"\"\"\n        print(\" | > Synthesizing test sentences.\")\n        test_audios = {}\n        test_figures = {}\n        test_sentences = self.config.test_sentences\n        aux_inputs = self._get_test_aux_input()\n        for idx, sen in enumerate(test_sentences):\n            if isinstance(sen, list):\n                aux_inputs = self.get_aux_input_from_test_sentences(sen)\n                sen = aux_inputs[\"text\"]\n            outputs_dict = synthesis(\n                self,\n                sen,\n                self.config,\n                \"cuda\" in str(next(self.parameters()).device),\n                speaker_id=aux_inputs[\"speaker_id\"],\n                d_vector=aux_inputs[\"d_vector\"],\n                style_wav=aux_inputs[\"style_wav\"],\n                use_griffin_lim=True,\n                do_trim_silence=False,\n            )\n            test_audios[\"{}-audio\".format(idx)] = outputs_dict[\"wav\"]\n            test_figures[\"{}-prediction\".format(idx)] = plot_spectrogram(\n                outputs_dict[\"outputs\"][\"model_outputs\"], self.ap, output_fig=False\n            )\n            test_figures[\"{}-alignment\".format(idx)] = plot_alignment(\n                outputs_dict[\"outputs\"][\"alignments\"], output_fig=False\n            )\n        return 
test_figures, test_audios\n\n    def on_init_start(self, trainer):\n        \"\"\"Save the speakers.pth and language_ids.json at the beginning of the training. Also update both paths.\"\"\"\n        if self.speaker_manager is not None:\n            output_path = os.path.join(trainer.output_path, \"speakers.pth\")\n            self.speaker_manager.save_ids_to_file(output_path)\n            trainer.config.speakers_file = output_path\n            # some models don't have `model_args` set\n            if hasattr(trainer.config, \"model_args\"):\n                trainer.config.model_args.speakers_file = output_path\n            trainer.config.save_json(os.path.join(trainer.output_path, \"config.json\"))\n            print(f\" > `speakers.pth` is saved to {output_path}.\")\n            print(\" > `speakers_file` is updated in the config.json.\")\n\n        if self.language_manager is not None:\n            output_path = os.path.join(trainer.output_path, \"language_ids.json\")\n            self.language_manager.save_ids_to_file(output_path)\n            trainer.config.language_ids_file = output_path\n            if hasattr(trainer.config, \"model_args\"):\n                trainer.config.model_args.language_ids_file = output_path\n            trainer.config.save_json(os.path.join(trainer.output_path, \"config.json\"))\n            print(f\" > `language_ids.json` is saved to {output_path}.\")\n            print(\" > `language_ids_file` is updated in the config.json.\")\n"
  },
  {
    "path": "TTS/vc/models/freevc.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import Dict, List, Optional, Tuple, Union\n\nimport librosa\nimport numpy as np\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.nn import AvgPool1d, Conv1d, Conv2d, ConvTranspose1d\nfrom torch.nn import functional as F\nfrom torch.nn.utils import remove_weight_norm, spectral_norm, weight_norm\n\nimport TTS.vc.modules.freevc.commons as commons\nimport TTS.vc.modules.freevc.modules as modules\nfrom TTS.tts.utils.speakers import SpeakerManager\nfrom TTS.utils.io import load_fsspec, save_checkpoint\nfrom TTS.vc.configs.shared_configs import BaseVCConfig\nfrom TTS.vc.models.base_vc import BaseVC\nfrom TTS.vc.modules.freevc.commons import get_padding, init_weights\nfrom TTS.vc.modules.freevc.mel_processing import mel_spectrogram_torch\nfrom TTS.vc.modules.freevc.speaker_encoder.speaker_encoder import SpeakerEncoder as SpeakerEncoderEx\nfrom TTS.vc.modules.freevc.wavlm import get_wavlm\n\n\nclass ResidualCouplingBlock(nn.Module):\n    def __init__(self, channels, hidden_channels, kernel_size, dilation_rate, n_layers, n_flows=4, gin_channels=0):\n        super().__init__()\n        self.channels = channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dilation_rate = dilation_rate\n        self.n_layers = n_layers\n        self.n_flows = n_flows\n        self.gin_channels = gin_channels\n\n        self.flows = nn.ModuleList()\n        for i in range(n_flows):\n            self.flows.append(\n                modules.ResidualCouplingLayer(\n                    channels,\n                    hidden_channels,\n                    kernel_size,\n                    dilation_rate,\n                    n_layers,\n                    gin_channels=gin_channels,\n                    mean_only=True,\n                )\n            )\n            self.flows.append(modules.Flip())\n\n    def forward(self, x, x_mask, g=None, 
reverse=False):\n        if not reverse:\n            for flow in self.flows:\n                x, _ = flow(x, x_mask, g=g, reverse=reverse)\n        else:\n            for flow in reversed(self.flows):\n                x = flow(x, x_mask, g=g, reverse=reverse)\n        return x\n\n\nclass Encoder(nn.Module):\n    def __init__(\n        self, in_channels, out_channels, hidden_channels, kernel_size, dilation_rate, n_layers, gin_channels=0\n    ):\n        super().__init__()\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dilation_rate = dilation_rate\n        self.n_layers = n_layers\n        self.gin_channels = gin_channels\n\n        self.pre = nn.Conv1d(in_channels, hidden_channels, 1)\n        self.enc = modules.WN(hidden_channels, kernel_size, dilation_rate, n_layers, gin_channels=gin_channels)\n        self.proj = nn.Conv1d(hidden_channels, out_channels * 2, 1)\n\n    def forward(self, x, x_lengths, g=None):\n        x_mask = torch.unsqueeze(commons.sequence_mask(x_lengths, x.size(2)), 1).to(x.dtype)\n        x = self.pre(x) * x_mask\n        x = self.enc(x, x_mask, g=g)\n        stats = self.proj(x) * x_mask\n        m, logs = torch.split(stats, self.out_channels, dim=1)\n        z = (m + torch.randn_like(m) * torch.exp(logs)) * x_mask\n        return z, m, logs, x_mask\n\n\nclass Generator(torch.nn.Module):\n    def __init__(\n        self,\n        initial_channel,\n        resblock,\n        resblock_kernel_sizes,\n        resblock_dilation_sizes,\n        upsample_rates,\n        upsample_initial_channel,\n        upsample_kernel_sizes,\n        gin_channels=0,\n    ):\n        super(Generator, self).__init__()\n        self.num_kernels = len(resblock_kernel_sizes)\n        self.num_upsamples = len(upsample_rates)\n        self.conv_pre = Conv1d(initial_channel, upsample_initial_channel, 7, 1, padding=3)\n        resblock 
= modules.ResBlock1 if resblock == \"1\" else modules.ResBlock2\n\n        self.ups = nn.ModuleList()\n        for i, (u, k) in enumerate(zip(upsample_rates, upsample_kernel_sizes)):\n            self.ups.append(\n                weight_norm(\n                    ConvTranspose1d(\n                        upsample_initial_channel // (2**i),\n                        upsample_initial_channel // (2 ** (i + 1)),\n                        k,\n                        u,\n                        padding=(k - u) // 2,\n                    )\n                )\n            )\n\n        self.resblocks = nn.ModuleList()\n        for i in range(len(self.ups)):\n            ch = upsample_initial_channel // (2 ** (i + 1))\n            for j, (k, d) in enumerate(zip(resblock_kernel_sizes, resblock_dilation_sizes)):\n                self.resblocks.append(resblock(ch, k, d))\n\n        self.conv_post = Conv1d(ch, 1, 7, 1, padding=3, bias=False)\n        self.ups.apply(init_weights)\n\n        if gin_channels != 0:\n            self.cond = nn.Conv1d(gin_channels, upsample_initial_channel, 1)\n\n    def forward(self, x, g=None):\n        x = self.conv_pre(x)\n        if g is not None:\n            x = x + self.cond(g)\n\n        for i in range(self.num_upsamples):\n            x = F.leaky_relu(x, modules.LRELU_SLOPE)\n            x = self.ups[i](x)\n            xs = None\n            for j in range(self.num_kernels):\n                if xs is None:\n                    xs = self.resblocks[i * self.num_kernels + j](x)\n                else:\n                    xs += self.resblocks[i * self.num_kernels + j](x)\n            x = xs / self.num_kernels\n        x = F.leaky_relu(x)\n        x = self.conv_post(x)\n        x = torch.tanh(x)\n\n        return x\n\n    def remove_weight_norm(self):\n        print(\"Removing weight norm...\")\n        for l in self.ups:\n            remove_weight_norm(l)\n        for l in self.resblocks:\n            l.remove_weight_norm()\n\n\nclass 
DiscriminatorP(torch.nn.Module):\n    def __init__(self, period, kernel_size=5, stride=3, use_spectral_norm=False):\n        super(DiscriminatorP, self).__init__()\n        self.period = period\n        self.use_spectral_norm = use_spectral_norm\n        norm_f = weight_norm if use_spectral_norm == False else spectral_norm\n        self.convs = nn.ModuleList(\n            [\n                norm_f(Conv2d(1, 32, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(Conv2d(32, 128, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(Conv2d(128, 512, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(Conv2d(512, 1024, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(Conv2d(1024, 1024, (kernel_size, 1), 1, padding=(get_padding(kernel_size, 1), 0))),\n            ]\n        )\n        self.conv_post = norm_f(Conv2d(1024, 1, (3, 1), 1, padding=(1, 0)))\n\n    def forward(self, x):\n        fmap = []\n\n        # 1d to 2d\n        b, c, t = x.shape\n        if t % self.period != 0:  # pad first\n            n_pad = self.period - (t % self.period)\n            x = F.pad(x, (0, n_pad), \"reflect\")\n            t = t + n_pad\n        x = x.view(b, c, t // self.period, self.period)\n\n        for l in self.convs:\n            x = l(x)\n            x = F.leaky_relu(x, modules.LRELU_SLOPE)\n            fmap.append(x)\n        x = self.conv_post(x)\n        fmap.append(x)\n        x = torch.flatten(x, 1, -1)\n\n        return x, fmap\n\n\nclass DiscriminatorS(torch.nn.Module):\n    def __init__(self, use_spectral_norm=False):\n        super(DiscriminatorS, self).__init__()\n        norm_f = weight_norm if use_spectral_norm == False else spectral_norm\n        self.convs = nn.ModuleList(\n            [\n                norm_f(Conv1d(1, 16, 15, 1, padding=7)),\n                
norm_f(Conv1d(16, 64, 41, 4, groups=4, padding=20)),\n                norm_f(Conv1d(64, 256, 41, 4, groups=16, padding=20)),\n                norm_f(Conv1d(256, 1024, 41, 4, groups=64, padding=20)),\n                norm_f(Conv1d(1024, 1024, 41, 4, groups=256, padding=20)),\n                norm_f(Conv1d(1024, 1024, 5, 1, padding=2)),\n            ]\n        )\n        self.conv_post = norm_f(Conv1d(1024, 1, 3, 1, padding=1))\n\n    def forward(self, x):\n        fmap = []\n\n        for l in self.convs:\n            x = l(x)\n            x = F.leaky_relu(x, modules.LRELU_SLOPE)\n            fmap.append(x)\n        x = self.conv_post(x)\n        fmap.append(x)\n        x = torch.flatten(x, 1, -1)\n\n        return x, fmap\n\n\nclass MultiPeriodDiscriminator(torch.nn.Module):\n    def __init__(self, use_spectral_norm=False):\n        super(MultiPeriodDiscriminator, self).__init__()\n        periods = [2, 3, 5, 7, 11]\n\n        discs = [DiscriminatorS(use_spectral_norm=use_spectral_norm)]\n        discs = discs + [DiscriminatorP(i, use_spectral_norm=use_spectral_norm) for i in periods]\n        self.discriminators = nn.ModuleList(discs)\n\n    def forward(self, y, y_hat):\n        y_d_rs = []\n        y_d_gs = []\n        fmap_rs = []\n        fmap_gs = []\n        for i, d in enumerate(self.discriminators):\n            y_d_r, fmap_r = d(y)\n            y_d_g, fmap_g = d(y_hat)\n            y_d_rs.append(y_d_r)\n            y_d_gs.append(y_d_g)\n            fmap_rs.append(fmap_r)\n            fmap_gs.append(fmap_g)\n\n        return y_d_rs, y_d_gs, fmap_rs, fmap_gs\n\n\nclass SpeakerEncoder(torch.nn.Module):\n    def __init__(self, mel_n_channels=80, model_num_layers=3, model_hidden_size=256, model_embedding_size=256):\n        super(SpeakerEncoder, self).__init__()\n        self.lstm = nn.LSTM(mel_n_channels, model_hidden_size, model_num_layers, batch_first=True)\n        self.linear = nn.Linear(model_hidden_size, model_embedding_size)\n        self.relu = 
nn.ReLU()\n\n    def forward(self, mels):\n        self.lstm.flatten_parameters()\n        _, (hidden, _) = self.lstm(mels)\n        embeds_raw = self.relu(self.linear(hidden[-1]))\n        return embeds_raw / torch.norm(embeds_raw, dim=1, keepdim=True)\n\n    def compute_partial_slices(self, total_frames, partial_frames, partial_hop):\n        mel_slices = []\n        for i in range(0, total_frames - partial_frames, partial_hop):\n            mel_range = torch.arange(i, i + partial_frames)\n            mel_slices.append(mel_range)\n\n        return mel_slices\n\n    def embed_utterance(self, mel, partial_frames=128, partial_hop=64):\n        mel_len = mel.size(1)\n        last_mel = mel[:, -partial_frames:]\n\n        if mel_len > partial_frames:\n            mel_slices = self.compute_partial_slices(mel_len, partial_frames, partial_hop)\n            mels = list(mel[:, s] for s in mel_slices)\n            mels.append(last_mel)\n            mels = torch.stack(tuple(mels), 0).squeeze(1)\n\n            with torch.no_grad():\n                partial_embeds = self(mels)\n            embed = torch.mean(partial_embeds, axis=0).unsqueeze(0)\n            # embed = embed / torch.linalg.norm(embed, 2)\n        else:\n            with torch.no_grad():\n                embed = self(last_mel)\n\n        return embed\n\n\n@dataclass\nclass FreeVCAudioConfig(Coqpit):\n    \"\"\"Audio configuration\n\n    Args:\n        max_wav_value (float):\n            The maximum value of the waveform.\n\n        input_sample_rate (int):\n            The sampling rate of the input waveform.\n\n        output_sample_rate (int):\n            The sampling rate of the output waveform.\n\n        filter_length (int):\n            The length of the filter.\n\n        hop_length (int):\n            The hop length.\n\n        win_length (int):\n            The window length.\n\n        n_mel_channels (int):\n            The number of mel channels.\n\n        mel_fmin (float):\n            The minimum 
frequency of the mel filterbank.\n\n        mel_fmax (Optional[float]):\n            The maximum frequency of the mel filterbank.\n    \"\"\"\n\n    max_wav_value: float = field(default=32768.0)\n    input_sample_rate: int = field(default=16000)\n    output_sample_rate: int = field(default=24000)\n    filter_length: int = field(default=1280)\n    hop_length: int = field(default=320)\n    win_length: int = field(default=1280)\n    n_mel_channels: int = field(default=80)\n    mel_fmin: float = field(default=0.0)\n    mel_fmax: Optional[float] = field(default=None)\n\n\n@dataclass\nclass FreeVCArgs(Coqpit):\n    \"\"\"FreeVC model arguments\n\n    Args:\n        spec_channels (int):\n            The number of channels in the spectrogram.\n\n        inter_channels (int):\n            The number of channels in the intermediate layers.\n\n        hidden_channels (int):\n            The number of channels in the hidden layers.\n\n        filter_channels (int):\n            The number of channels in the filter layers.\n\n        n_heads (int):\n            The number of attention heads.\n\n        n_layers (int):\n            The number of layers.\n\n        kernel_size (int):\n            The size of the kernel.\n\n        p_dropout (float):\n            The dropout probability.\n\n        resblock (str):\n            The type of residual block.\n\n        resblock_kernel_sizes (List[int]):\n            The kernel sizes for the residual blocks.\n\n        resblock_dilation_sizes (List[List[int]]):\n            The dilation sizes for the residual blocks.\n\n        upsample_rates (List[int]):\n            The upsample rates.\n\n        upsample_initial_channel (int):\n            The number of channels in the initial upsample layer.\n\n        upsample_kernel_sizes (List[int]):\n            The kernel sizes for the upsample layers.\n\n        n_layers_q (int):\n            The number of layers in the quantization network.\n\n        use_spectral_norm (bool):\n            
Whether to use spectral normalization.\n\n        gin_channels (int):\n            The number of channels in the global conditioning vector.\n\n        ssl_dim (int):\n            The dimension of the self-supervised learning embedding.\n\n        use_spk (bool):\n            Whether to use external speaker encoder.\n    \"\"\"\n\n    spec_channels: int = field(default=641)\n    inter_channels: int = field(default=192)\n    hidden_channels: int = field(default=192)\n    filter_channels: int = field(default=768)\n    n_heads: int = field(default=2)\n    n_layers: int = field(default=6)\n    kernel_size: int = field(default=3)\n    p_dropout: float = field(default=0.1)\n    resblock: str = field(default=\"1\")\n    resblock_kernel_sizes: List[int] = field(default_factory=lambda: [3, 7, 11])\n    resblock_dilation_sizes: List[List[int]] = field(default_factory=lambda: [[1, 3, 5], [1, 3, 5], [1, 3, 5]])\n    upsample_rates: List[int] = field(default_factory=lambda: [10, 8, 2, 2])\n    upsample_initial_channel: int = field(default=512)\n    upsample_kernel_sizes: List[int] = field(default_factory=lambda: [16, 16, 4, 4])\n    n_layers_q: int = field(default=3)\n    use_spectral_norm: bool = field(default=False)\n    gin_channels: int = field(default=256)\n    ssl_dim: int = field(default=1024)\n    use_spk: bool = field(default=False)\n    num_spks: int = field(default=0)\n    segment_size: int = field(default=8960)\n\n\nclass FreeVC(BaseVC):\n    \"\"\"\n\n    Paper::\n        https://arxiv.org/abs/2210.15418#\n\n    Paper Abstract::\n        Voice conversion (VC) can be achieved by first extracting source content information and target speaker\n        information, and then reconstructing waveform with these information. However, current approaches normally\n        either extract dirty content information with speaker information leaked in, or demand a large amount of\n        annotated data for training. 
Besides, the quality of reconstructed waveform can be degraded by the\n        mismatch between conversion model and vocoder. In this paper, we adopt the end-to-end framework of VITS for\n        high-quality waveform reconstruction, and propose strategies for clean content information extraction without\n        text annotation. We disentangle content information by imposing an information bottleneck to WavLM features,\n        and propose the spectrogram-resize based data augmentation to improve the purity of extracted content\n        information. Experimental results show that the proposed method outperforms the latest VC models trained with\n        annotated data and has greater robustness.\n\n    Original Code::\n        https://github.com/OlaWod/FreeVC\n\n    Examples:\n        >>> from TTS.vc.configs.freevc_config import FreeVCConfig\n        >>> from TTS.vc.models.freevc import FreeVC\n        >>> config = FreeVCConfig()\n        >>> model = FreeVC(config)\n    \"\"\"\n\n    def __init__(self, config: Coqpit, speaker_manager: SpeakerManager = None):\n        super().__init__(config, None, speaker_manager, None)\n\n        self.init_multispeaker(config)\n\n        self.spec_channels = self.args.spec_channels\n        self.inter_channels = self.args.inter_channels\n        self.hidden_channels = self.args.hidden_channels\n        self.filter_channels = self.args.filter_channels\n        self.n_heads = self.args.n_heads\n        self.n_layers = self.args.n_layers\n        self.kernel_size = self.args.kernel_size\n        self.p_dropout = self.args.p_dropout\n        self.resblock = self.args.resblock\n        self.resblock_kernel_sizes = self.args.resblock_kernel_sizes\n        self.resblock_dilation_sizes = self.args.resblock_dilation_sizes\n        self.upsample_rates = self.args.upsample_rates\n        self.upsample_initial_channel = self.args.upsample_initial_channel\n        self.upsample_kernel_sizes = self.args.upsample_kernel_sizes\n        
self.segment_size = self.args.segment_size\n        self.gin_channels = self.args.gin_channels\n        self.ssl_dim = self.args.ssl_dim\n        self.use_spk = self.args.use_spk\n\n        self.enc_p = Encoder(self.args.ssl_dim, self.inter_channels, self.hidden_channels, 5, 1, 16)\n        self.dec = Generator(\n            self.inter_channels,\n            self.resblock,\n            self.resblock_kernel_sizes,\n            self.resblock_dilation_sizes,\n            self.upsample_rates,\n            self.upsample_initial_channel,\n            self.upsample_kernel_sizes,\n            gin_channels=self.gin_channels,\n        )\n        self.enc_q = Encoder(\n            self.spec_channels, self.inter_channels, self.hidden_channels, 5, 1, 16, gin_channels=self.gin_channels\n        )\n        self.flow = ResidualCouplingBlock(\n            self.inter_channels, self.hidden_channels, 5, 1, 4, gin_channels=self.gin_channels\n        )\n        if not self.use_spk:\n            self.enc_spk = SpeakerEncoder(model_hidden_size=self.gin_channels, model_embedding_size=self.gin_channels)\n        else:\n            self.load_pretrained_speaker_encoder()\n\n        self.wavlm = get_wavlm()\n\n    @property\n    def device(self):\n        return next(self.parameters()).device\n\n    def load_pretrained_speaker_encoder(self):\n        \"\"\"Load pretrained speaker encoder model as mentioned in the paper.\"\"\"\n        print(\" > Loading pretrained speaker encoder model ...\")\n        self.enc_spk_ex = SpeakerEncoderEx(\n            \"https://github.com/coqui-ai/TTS/releases/download/v0.13.0_models/speaker_encoder.pt\"\n        )\n\n    def init_multispeaker(self, config: Coqpit):\n        \"\"\"Initialize multi-speaker modules of a model. 
A model can be trained either with a speaker embedding layer\n        or with external `d_vectors` computed from a speaker encoder model.\n\n        You must provide a `speaker_manager` at initialization to set up the multi-speaker modules.\n\n        Args:\n            config (Coqpit): Model configuration.\n            data (List, optional): Dataset items to infer number of speakers. Defaults to None.\n        \"\"\"\n        self.num_spks = self.args.num_spks\n        if self.speaker_manager:\n            self.num_spks = self.speaker_manager.num_spks\n\n    def forward(\n        self,\n        c: torch.Tensor,\n        spec: torch.Tensor,\n        g: Optional[torch.Tensor] = None,\n        mel: Optional[torch.Tensor] = None,\n        c_lengths: Optional[torch.Tensor] = None,\n        spec_lengths: Optional[torch.Tensor] = None,\n    ) -> Tuple[\n        torch.Tensor,\n        torch.Tensor,\n        torch.Tensor,\n        Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor],\n    ]:\n        \"\"\"\n        Forward pass of the model.\n\n        Args:\n            c: WavLM features. Shape: (batch_size, c_seq_len).\n            spec: The input spectrogram. Shape: (batch_size, spec_seq_len, spec_dim).\n            g: The speaker embedding. Shape: (batch_size, spk_emb_dim).\n            mel: The input mel-spectrogram for the speaker encoder. Shape: (batch_size, mel_seq_len, mel_dim).\n            c_lengths: The lengths of the WavLM features. Shape: (batch_size,).\n            spec_lengths: The lengths of the spectrogram. Shape: (batch_size,).\n\n        Returns:\n            o: The output spectrogram. Shape: (batch_size, spec_seq_len, spec_dim).\n            ids_slice: The slice indices. Shape: (batch_size, num_slices).\n            spec_mask: The spectrogram mask. 
Shape: (batch_size, spec_seq_len).\n            (z, z_p, m_p, logs_p, m_q, logs_q): A tuple of latent variables.\n        \"\"\"\n\n        # If c_lengths is None, set it to the length of the last dimension of c\n        if c_lengths is None:\n            c_lengths = (torch.ones(c.size(0)) * c.size(-1)).to(c.device)\n\n        # If spec_lengths is None, set it to the length of the last dimension of spec\n        if spec_lengths is None:\n            spec_lengths = (torch.ones(spec.size(0)) * spec.size(-1)).to(spec.device)\n\n        # If use_spk is False, compute g from mel using enc_spk\n        g = None\n        if not self.use_spk:\n            g = self.enc_spk(mel).unsqueeze(-1)\n\n        # Compute m_p, logs_p, z, m_q, logs_q, and spec_mask using enc_p and enc_q\n        _, m_p, logs_p, _ = self.enc_p(c, c_lengths)\n        z, m_q, logs_q, spec_mask = self.enc_q(spec.transpose(1, 2), spec_lengths, g=g)\n\n        # Compute z_p using flow\n        z_p = self.flow(z, spec_mask, g=g)\n\n        # Randomly slice z and compute o using dec\n        z_slice, ids_slice = commons.rand_slice_segments(z, spec_lengths, self.segment_size)\n        o = self.dec(z_slice, g=g)\n\n        return o, ids_slice, spec_mask, (z, z_p, m_p, logs_p, m_q, logs_q)\n\n    @torch.no_grad()\n    def inference(self, c, g=None, mel=None, c_lengths=None):\n        \"\"\"\n        Inference pass of the model\n\n        Args:\n            c (torch.Tensor): Input tensor. Shape: (batch_size, c_seq_len).\n            g (torch.Tensor): Speaker embedding tensor. Shape: (batch_size, spk_emb_dim).\n            mel (torch.Tensor): Mel-spectrogram tensor. Shape: (batch_size, mel_seq_len, mel_dim).\n            c_lengths (torch.Tensor): Lengths of the input tensor. 
Shape: (batch_size,).\n\n        Returns:\n            torch.Tensor: Output tensor.\n        \"\"\"\n        if c_lengths is None:\n            c_lengths = (torch.ones(c.size(0)) * c.size(-1)).to(c.device)\n        if not self.use_spk:\n            g = self.enc_spk.embed_utterance(mel)\n            g = g.unsqueeze(-1)\n        z_p, m_p, logs_p, c_mask = self.enc_p(c, c_lengths)\n        z = self.flow(z_p, c_mask, g=g, reverse=True)\n        o = self.dec(z * c_mask, g=g)\n        return o\n\n    def extract_wavlm_features(self, y):\n        \"\"\"Extract WavLM features from an audio tensor.\n\n        Args:\n            y (torch.Tensor): Audio tensor. Shape: (batch_size, audio_seq_len).\n        \"\"\"\n\n        with torch.no_grad():\n            c = self.wavlm.extract_features(y)[0]\n        c = c.transpose(1, 2)\n        return c\n\n    def load_audio(self, wav):\n        \"\"\"Read and format the input audio.\"\"\"\n        if isinstance(wav, str):\n            wav, _ = librosa.load(wav, sr=self.config.audio.input_sample_rate)\n        if isinstance(wav, np.ndarray):\n            wav = torch.from_numpy(wav).to(self.device)\n        if isinstance(wav, torch.Tensor):\n            wav = wav.to(self.device)\n        if isinstance(wav, list):\n            wav = torch.from_numpy(np.array(wav)).to(self.device)\n        return wav.float()\n\n    @torch.inference_mode()\n    def voice_conversion(self, src, tgt):\n        \"\"\"\n        Voice conversion pass of the model.\n\n        Args:\n            src (str or torch.Tensor): Source utterance.\n            tgt (str or torch.Tensor): Target utterance.\n\n        Returns:\n            np.ndarray: Converted waveform.\n        \"\"\"\n\n        wav_tgt = self.load_audio(tgt).cpu().numpy()\n        wav_tgt, _ = librosa.effects.trim(wav_tgt, top_db=20)\n\n        if self.config.model_args.use_spk:\n            g_tgt = self.enc_spk_ex.embed_utterance(wav_tgt)\n            g_tgt = torch.from_numpy(g_tgt)[None, :, 
None].to(self.device)\n        else:\n            wav_tgt = torch.from_numpy(wav_tgt).unsqueeze(0).to(self.device)\n            mel_tgt = mel_spectrogram_torch(\n                wav_tgt,\n                self.config.audio.filter_length,\n                self.config.audio.n_mel_channels,\n                self.config.audio.input_sample_rate,\n                self.config.audio.hop_length,\n                self.config.audio.win_length,\n                self.config.audio.mel_fmin,\n                self.config.audio.mel_fmax,\n            )\n        # src\n        wav_src = self.load_audio(src)\n        c = self.extract_wavlm_features(wav_src[None, :])\n\n        if self.config.model_args.use_spk:\n            audio = self.inference(c, g=g_tgt)\n        else:\n            audio = self.inference(c, mel=mel_tgt.transpose(1, 2))\n        audio = audio[0][0].data.cpu().float().numpy()\n        return audio\n\n    def eval_step(self):\n        ...\n\n    @staticmethod\n    def init_from_config(config: \"FreeVCConfig\", samples: Union[List[List], List[Dict]] = None, verbose=True):\n        model = FreeVC(config)\n        return model\n\n    def load_checkpoint(self, config, checkpoint_path, eval=False, strict=True, cache=False):\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"], strict=strict)\n        if eval:\n            self.eval()\n\n    def train_step(self):\n        ...\n\n\n@dataclass\nclass FreeVCConfig(BaseVCConfig):\n    \"\"\"Defines parameters for the FreeVC end-to-end voice conversion model.\n\n    Args:\n        model (str):\n            Model name. Do not change unless you know what you are doing.\n\n        model_args (FreeVCArgs):\n            Model architecture arguments. Defaults to `FreeVCArgs()`.\n\n        audio (FreeVCAudioConfig):\n            Audio processing configuration. 
Defaults to `FreeVCAudioConfig()`.\n\n        grad_clip (List):\n            Gradient clipping thresholds for each optimizer. Defaults to `[1000.0, 1000.0]`.\n\n        lr_gen (float):\n            Initial learning rate for the generator. Defaults to 0.0002.\n\n        lr_disc (float):\n            Initial learning rate for the discriminator. Defaults to 0.0002.\n\n        lr_scheduler_gen (str):\n            Name of the learning rate scheduler for the generator. One of the `torch.optim.lr_scheduler.*`. Defaults to\n            `ExponentialLR`.\n\n        lr_scheduler_gen_params (dict):\n            Parameters for the learning rate scheduler of the generator. Defaults to `{'gamma': 0.999875, \"last_epoch\":-1}`.\n\n        lr_scheduler_disc (str):\n            Name of the learning rate scheduler for the discriminator. One of the `torch.optim.lr_scheduler.*`. Defaults to\n            `ExponentialLR`.\n\n        lr_scheduler_disc_params (dict):\n            Parameters for the learning rate scheduler of the discriminator. Defaults to `{'gamma': 0.999875, \"last_epoch\":-1}`.\n\n        scheduler_after_epoch (bool):\n            If true, step the schedulers after each epoch else after each step. Defaults to `False`.\n\n        optimizer (str):\n            Name of the optimizer to use with both the generator and the discriminator networks. One of the\n            `torch.optim.*`. Defaults to `AdamW`.\n\n        kl_loss_alpha (float):\n            Loss weight for KL loss. Defaults to 1.0.\n\n        disc_loss_alpha (float):\n            Loss weight for the discriminator loss. Defaults to 1.0.\n\n        gen_loss_alpha (float):\n            Loss weight for the generator loss. Defaults to 1.0.\n\n        feat_loss_alpha (float):\n            Loss weight for the feature matching loss. Defaults to 1.0.\n\n        mel_loss_alpha (float):\n            Loss weight for the mel loss. 
Defaults to 45.0.\n\n        return_wav (bool):\n            If true, data loader returns the waveform as well as the other outputs. Do not change. Defaults to `True`.\n\n        compute_linear_spec (bool):\n            If true, the linear spectrogram is computed and returned alongside the mel output. Do not change. Defaults to `True`.\n\n        use_weighted_sampler (bool):\n            If true, use weighted sampler with bucketing for balancing samples between datasets used in training. Defaults to `False`.\n\n        weighted_sampler_attrs (dict):\n            Keys returned by the formatter to be used by the weighted sampler. For example `{\"root_path\": 2.0, \"speaker_name\": 1.0}` sets sample probabilities\n            by overweighting `root_path` by 2.0. Defaults to `{}`.\n\n        weighted_sampler_multipliers (dict):\n            Weight each unique value of a key returned by the formatter for weighted sampling.\n            For example `{\"root_path\":{\"/raid/datasets/libritts-clean-16khz-bwe-coqui_44khz/LibriTTS/train-clean-100/\":1.0, \"/raid/datasets/libritts-clean-16khz-bwe-coqui_44khz/LibriTTS/train-clean-360/\": 0.5}}`.\n            It will sample instances from `train-clean-100` twice as often as from `train-clean-360`. Defaults to `{}`.\n\n        r (int):\n            Number of spectrogram frames to be generated at a time. Do not change. Defaults to `1`.\n\n        add_blank (bool):\n            If true, a blank token is added in between every character. Defaults to `True`.\n\n        test_sentences (List[List]):\n            List of sentences with speaker and language information to be used for testing.\n\n        language_ids_file (str):\n            Path to the language ids file.\n\n        use_language_embedding (bool):\n            If true, language embedding is used. 
Defaults to `False`.\n\n    Note:\n        Check :class:`TTS.vc.configs.shared_configs.BaseVCConfig` for the inherited parameters.\n\n    Example:\n\n        >>> from TTS.vc.models.freevc import FreeVCConfig\n        >>> config = FreeVCConfig()\n    \"\"\"\n\n    model: str = \"freevc\"\n    # model specific params\n    # default_factory avoids sharing one mutable default instance across configs\n    model_args: FreeVCArgs = field(default_factory=FreeVCArgs)\n    audio: FreeVCAudioConfig = field(default_factory=FreeVCAudioConfig)\n\n    # optimizer\n    # TODO with training support\n\n    # loss params\n    # TODO with training support\n\n    # data loader params\n    return_wav: bool = True\n    compute_linear_spec: bool = True\n\n    # sampler params\n    use_weighted_sampler: bool = False  # TODO: move it to the base config\n    weighted_sampler_attrs: dict = field(default_factory=lambda: {})\n    weighted_sampler_multipliers: dict = field(default_factory=lambda: {})\n\n    # overrides\n    r: int = 1  # DO NOT CHANGE\n    add_blank: bool = True\n\n    # multi-speaker settings\n    # use speaker embedding layer\n    num_speakers: int = 0\n    speakers_file: str = None\n    speaker_embedding_channels: int = 256\n\n    # use d-vectors\n    use_d_vector_file: bool = False\n    d_vector_file: List[str] = None\n    d_vector_dim: int = None\n\n    def __post_init__(self):\n        for key, val in self.model_args.items():\n            if hasattr(self, key):\n                self[key] = val\n"
  },
  {
    "path": "TTS/vc/modules/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/vc/modules/freevc/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/vc/modules/freevc/commons.py",
    "content": "import math\n\nimport numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\n\ndef init_weights(m, mean=0.0, std=0.01):\n    classname = m.__class__.__name__\n    if classname.find(\"Conv\") != -1:\n        m.weight.data.normal_(mean, std)\n\n\ndef get_padding(kernel_size, dilation=1):\n    return int((kernel_size * dilation - dilation) / 2)\n\n\ndef convert_pad_shape(pad_shape):\n    l = pad_shape[::-1]\n    pad_shape = [item for sublist in l for item in sublist]\n    return pad_shape\n\n\ndef intersperse(lst, item):\n    result = [item] * (len(lst) * 2 + 1)\n    result[1::2] = lst\n    return result\n\n\ndef kl_divergence(m_p, logs_p, m_q, logs_q):\n    \"\"\"KL(P||Q)\"\"\"\n    kl = (logs_q - logs_p) - 0.5\n    kl += 0.5 * (torch.exp(2.0 * logs_p) + ((m_p - m_q) ** 2)) * torch.exp(-2.0 * logs_q)\n    return kl\n\n\ndef rand_gumbel(shape):\n    \"\"\"Sample from the Gumbel distribution, protect from overflows.\"\"\"\n    uniform_samples = torch.rand(shape) * 0.99998 + 0.00001\n    return -torch.log(-torch.log(uniform_samples))\n\n\ndef rand_gumbel_like(x):\n    g = rand_gumbel(x.size()).to(dtype=x.dtype, device=x.device)\n    return g\n\n\ndef slice_segments(x, ids_str, segment_size=4):\n    ret = torch.zeros_like(x[:, :, :segment_size])\n    for i in range(x.size(0)):\n        idx_str = ids_str[i]\n        idx_end = idx_str + segment_size\n        ret[i] = x[i, :, idx_str:idx_end]\n    return ret\n\n\ndef rand_slice_segments(x, x_lengths=None, segment_size=4):\n    b, d, t = x.size()\n    if x_lengths is None:\n        x_lengths = t\n    ids_str_max = x_lengths - segment_size + 1\n    ids_str = (torch.rand([b]).to(device=x.device) * ids_str_max).to(dtype=torch.long)\n    ret = slice_segments(x, ids_str, segment_size)\n    return ret, ids_str\n\n\ndef rand_spec_segments(x, x_lengths=None, segment_size=4):\n    b, d, t = x.size()\n    if x_lengths is None:\n        x_lengths = t\n    ids_str_max = x_lengths - 
segment_size\n    ids_str = (torch.rand([b]).to(device=x.device) * ids_str_max).to(dtype=torch.long)\n    ret = slice_segments(x, ids_str, segment_size)\n    return ret, ids_str\n\n\ndef get_timing_signal_1d(length, channels, min_timescale=1.0, max_timescale=1.0e4):\n    position = torch.arange(length, dtype=torch.float)\n    num_timescales = channels // 2\n    log_timescale_increment = math.log(float(max_timescale) / float(min_timescale)) / (num_timescales - 1)\n    inv_timescales = min_timescale * torch.exp(\n        torch.arange(num_timescales, dtype=torch.float) * -log_timescale_increment\n    )\n    scaled_time = position.unsqueeze(0) * inv_timescales.unsqueeze(1)\n    signal = torch.cat([torch.sin(scaled_time), torch.cos(scaled_time)], 0)\n    signal = F.pad(signal, [0, 0, 0, channels % 2])\n    signal = signal.view(1, channels, length)\n    return signal\n\n\ndef add_timing_signal_1d(x, min_timescale=1.0, max_timescale=1.0e4):\n    b, channels, length = x.size()\n    signal = get_timing_signal_1d(length, channels, min_timescale, max_timescale)\n    return x + signal.to(dtype=x.dtype, device=x.device)\n\n\ndef cat_timing_signal_1d(x, min_timescale=1.0, max_timescale=1.0e4, axis=1):\n    b, channels, length = x.size()\n    signal = get_timing_signal_1d(length, channels, min_timescale, max_timescale)\n    return torch.cat([x, signal.to(dtype=x.dtype, device=x.device)], axis)\n\n\ndef subsequent_mask(length):\n    mask = torch.tril(torch.ones(length, length)).unsqueeze(0).unsqueeze(0)\n    return mask\n\n\n@torch.jit.script\ndef fused_add_tanh_sigmoid_multiply(input_a, input_b, n_channels):\n    n_channels_int = n_channels[0]\n    in_act = input_a + input_b\n    t_act = torch.tanh(in_act[:, :n_channels_int, :])\n    s_act = torch.sigmoid(in_act[:, n_channels_int:, :])\n    acts = t_act * s_act\n    return acts\n\n\ndef convert_pad_shape(pad_shape):\n    l = pad_shape[::-1]\n    pad_shape = [item for sublist in l for item in sublist]\n    return 
pad_shape\n\n\ndef shift_1d(x):\n    x = F.pad(x, convert_pad_shape([[0, 0], [0, 0], [1, 0]]))[:, :, :-1]\n    return x\n\n\ndef sequence_mask(length, max_length=None):\n    if max_length is None:\n        max_length = length.max()\n    x = torch.arange(max_length, dtype=length.dtype, device=length.device)\n    return x.unsqueeze(0) < length.unsqueeze(1)\n\n\ndef generate_path(duration, mask):\n    \"\"\"\n    duration: [b, 1, t_x]\n    mask: [b, 1, t_y, t_x]\n    \"\"\"\n    device = duration.device\n\n    b, _, t_y, t_x = mask.shape\n    cum_duration = torch.cumsum(duration, -1)\n\n    cum_duration_flat = cum_duration.view(b * t_x)\n    path = sequence_mask(cum_duration_flat, t_y).to(mask.dtype)\n    path = path.view(b, t_x, t_y)\n    path = path - F.pad(path, convert_pad_shape([[0, 0], [1, 0], [0, 0]]))[:, :-1]\n    path = path.unsqueeze(1).transpose(2, 3) * mask\n    return path\n\n\ndef clip_grad_value_(parameters, clip_value, norm_type=2):\n    if isinstance(parameters, torch.Tensor):\n        parameters = [parameters]\n    parameters = list(filter(lambda p: p.grad is not None, parameters))\n    norm_type = float(norm_type)\n    if clip_value is not None:\n        clip_value = float(clip_value)\n\n    total_norm = 0\n    for p in parameters:\n        param_norm = p.grad.data.norm(norm_type)\n        total_norm += param_norm.item() ** norm_type\n        if clip_value is not None:\n            p.grad.data.clamp_(min=-clip_value, max=clip_value)\n    total_norm = total_norm ** (1.0 / norm_type)\n    return total_norm\n"
  },
  {
    "path": "TTS/vc/modules/freevc/mel_processing.py",
    "content": "import torch\nimport torch.utils.data\nfrom librosa.filters import mel as librosa_mel_fn\n\nMAX_WAV_VALUE = 32768.0\n\n\ndef dynamic_range_compression_torch(x, C=1, clip_val=1e-5):\n    \"\"\"\n    PARAMS\n    ------\n    C: compression factor\n    \"\"\"\n    return torch.log(torch.clamp(x, min=clip_val) * C)\n\n\ndef dynamic_range_decompression_torch(x, C=1):\n    \"\"\"\n    PARAMS\n    ------\n    C: compression factor used to compress\n    \"\"\"\n    return torch.exp(x) / C\n\n\ndef spectral_normalize_torch(magnitudes):\n    output = dynamic_range_compression_torch(magnitudes)\n    return output\n\n\ndef spectral_de_normalize_torch(magnitudes):\n    output = dynamic_range_decompression_torch(magnitudes)\n    return output\n\n\nmel_basis = {}\nhann_window = {}\n\n\ndef spectrogram_torch(y, n_fft, sampling_rate, hop_size, win_size, center=False):\n    if torch.min(y) < -1.0:\n        print(\"min value is \", torch.min(y))\n    if torch.max(y) > 1.0:\n        print(\"max value is \", torch.max(y))\n\n    global hann_window\n    dtype_device = str(y.dtype) + \"_\" + str(y.device)\n    wnsize_dtype_device = str(win_size) + \"_\" + dtype_device\n    if wnsize_dtype_device not in hann_window:\n        hann_window[wnsize_dtype_device] = torch.hann_window(win_size).to(dtype=y.dtype, device=y.device)\n\n    y = torch.nn.functional.pad(\n        y.unsqueeze(1), (int((n_fft - hop_size) / 2), int((n_fft - hop_size) / 2)), mode=\"reflect\"\n    )\n    y = y.squeeze(1)\n\n    spec = torch.stft(\n        y,\n        n_fft,\n        hop_length=hop_size,\n        win_length=win_size,\n        window=hann_window[wnsize_dtype_device],\n        center=center,\n        pad_mode=\"reflect\",\n        normalized=False,\n        onesided=True,\n        return_complex=False,\n    )\n\n    spec = torch.sqrt(spec.pow(2).sum(-1) + 1e-6)\n    return spec\n\n\ndef spec_to_mel_torch(spec, n_fft, num_mels, sampling_rate, fmin, fmax):\n    global mel_basis\n    dtype_device = 
str(spec.dtype) + \"_\" + str(spec.device)\n    fmax_dtype_device = str(fmax) + \"_\" + dtype_device\n    if fmax_dtype_device not in mel_basis:\n        mel = librosa_mel_fn(sr=sampling_rate, n_fft=n_fft, n_mels=num_mels, fmin=fmin, fmax=fmax)\n        mel_basis[fmax_dtype_device] = torch.from_numpy(mel).to(dtype=spec.dtype, device=spec.device)\n    spec = torch.matmul(mel_basis[fmax_dtype_device], spec)\n    spec = spectral_normalize_torch(spec)\n    return spec\n\n\ndef mel_spectrogram_torch(y, n_fft, num_mels, sampling_rate, hop_size, win_size, fmin, fmax, center=False):\n    if torch.min(y) < -1.0:\n        print(\"min value is \", torch.min(y))\n    if torch.max(y) > 1.0:\n        print(\"max value is \", torch.max(y))\n\n    global mel_basis, hann_window\n    dtype_device = str(y.dtype) + \"_\" + str(y.device)\n    fmax_dtype_device = str(fmax) + \"_\" + dtype_device\n    wnsize_dtype_device = str(win_size) + \"_\" + dtype_device\n    if fmax_dtype_device not in mel_basis:\n        mel = librosa_mel_fn(sr=sampling_rate, n_fft=n_fft, n_mels=num_mels, fmin=fmin, fmax=fmax)\n        mel_basis[fmax_dtype_device] = torch.from_numpy(mel).to(dtype=y.dtype, device=y.device)\n    if wnsize_dtype_device not in hann_window:\n        hann_window[wnsize_dtype_device] = torch.hann_window(win_size).to(dtype=y.dtype, device=y.device)\n\n    y = torch.nn.functional.pad(\n        y.unsqueeze(1), (int((n_fft - hop_size) / 2), int((n_fft - hop_size) / 2)), mode=\"reflect\"\n    )\n    y = y.squeeze(1)\n\n    spec = torch.stft(\n        y,\n        n_fft,\n        hop_length=hop_size,\n        win_length=win_size,\n        window=hann_window[wnsize_dtype_device],\n        center=center,\n        pad_mode=\"reflect\",\n        normalized=False,\n        onesided=True,\n        return_complex=False,\n    )\n\n    spec = torch.sqrt(spec.pow(2).sum(-1) + 1e-6)\n\n    spec = torch.matmul(mel_basis[fmax_dtype_device], spec)\n    spec = spectral_normalize_torch(spec)\n\n    return 
spec\n"
  },
  {
    "path": "TTS/vc/modules/freevc/modules.py",
    "content": "import copy\nimport math\n\nimport numpy as np\nimport scipy\nimport torch\nfrom torch import nn\nfrom torch.nn import AvgPool1d, Conv1d, Conv2d, ConvTranspose1d\nfrom torch.nn import functional as F\nfrom torch.nn.utils import remove_weight_norm, weight_norm\n\nimport TTS.vc.modules.freevc.commons as commons\nfrom TTS.vc.modules.freevc.commons import get_padding, init_weights\n\nLRELU_SLOPE = 0.1\n\n\nclass LayerNorm(nn.Module):\n    def __init__(self, channels, eps=1e-5):\n        super().__init__()\n        self.channels = channels\n        self.eps = eps\n\n        self.gamma = nn.Parameter(torch.ones(channels))\n        self.beta = nn.Parameter(torch.zeros(channels))\n\n    def forward(self, x):\n        x = x.transpose(1, -1)\n        x = F.layer_norm(x, (self.channels,), self.gamma, self.beta, self.eps)\n        return x.transpose(1, -1)\n\n\nclass ConvReluNorm(nn.Module):\n    def __init__(self, in_channels, hidden_channels, out_channels, kernel_size, n_layers, p_dropout):\n        super().__init__()\n        self.in_channels = in_channels\n        self.hidden_channels = hidden_channels\n        self.out_channels = out_channels\n        self.kernel_size = kernel_size\n        self.n_layers = n_layers\n        self.p_dropout = p_dropout\n        assert n_layers > 1, \"Number of layers should be larger than 0.\"\n\n        self.conv_layers = nn.ModuleList()\n        self.norm_layers = nn.ModuleList()\n        self.conv_layers.append(nn.Conv1d(in_channels, hidden_channels, kernel_size, padding=kernel_size // 2))\n        self.norm_layers.append(LayerNorm(hidden_channels))\n        self.relu_drop = nn.Sequential(nn.ReLU(), nn.Dropout(p_dropout))\n        for _ in range(n_layers - 1):\n            self.conv_layers.append(nn.Conv1d(hidden_channels, hidden_channels, kernel_size, padding=kernel_size // 2))\n            self.norm_layers.append(LayerNorm(hidden_channels))\n        self.proj = nn.Conv1d(hidden_channels, out_channels, 1)\n        
self.proj.weight.data.zero_()\n        self.proj.bias.data.zero_()\n\n    def forward(self, x, x_mask):\n        x_org = x\n        for i in range(self.n_layers):\n            x = self.conv_layers[i](x * x_mask)\n            x = self.norm_layers[i](x)\n            x = self.relu_drop(x)\n        x = x_org + self.proj(x)\n        return x * x_mask\n\n\nclass DDSConv(nn.Module):\n    \"\"\"\n    Dilated and Depth-Separable Convolution\n    \"\"\"\n\n    def __init__(self, channels, kernel_size, n_layers, p_dropout=0.0):\n        super().__init__()\n        self.channels = channels\n        self.kernel_size = kernel_size\n        self.n_layers = n_layers\n        self.p_dropout = p_dropout\n\n        self.drop = nn.Dropout(p_dropout)\n        self.convs_sep = nn.ModuleList()\n        self.convs_1x1 = nn.ModuleList()\n        self.norms_1 = nn.ModuleList()\n        self.norms_2 = nn.ModuleList()\n        for i in range(n_layers):\n            dilation = kernel_size**i\n            padding = (kernel_size * dilation - dilation) // 2\n            self.convs_sep.append(\n                nn.Conv1d(channels, channels, kernel_size, groups=channels, dilation=dilation, padding=padding)\n            )\n            self.convs_1x1.append(nn.Conv1d(channels, channels, 1))\n            self.norms_1.append(LayerNorm(channels))\n            self.norms_2.append(LayerNorm(channels))\n\n    def forward(self, x, x_mask, g=None):\n        if g is not None:\n            x = x + g\n        for i in range(self.n_layers):\n            y = self.convs_sep[i](x * x_mask)\n            y = self.norms_1[i](y)\n            y = F.gelu(y)\n            y = self.convs_1x1[i](y)\n            y = self.norms_2[i](y)\n            y = F.gelu(y)\n            y = self.drop(y)\n            x = x + y\n        return x * x_mask\n\n\nclass WN(torch.nn.Module):\n    def __init__(self, hidden_channels, kernel_size, dilation_rate, n_layers, gin_channels=0, p_dropout=0):\n        super(WN, self).__init__()\n        
assert kernel_size % 2 == 1\n        self.hidden_channels = hidden_channels\n        self.kernel_size = (kernel_size,)\n        self.dilation_rate = dilation_rate\n        self.n_layers = n_layers\n        self.gin_channels = gin_channels\n        self.p_dropout = p_dropout\n\n        self.in_layers = torch.nn.ModuleList()\n        self.res_skip_layers = torch.nn.ModuleList()\n        self.drop = nn.Dropout(p_dropout)\n\n        if gin_channels != 0:\n            cond_layer = torch.nn.Conv1d(gin_channels, 2 * hidden_channels * n_layers, 1)\n            self.cond_layer = torch.nn.utils.weight_norm(cond_layer, name=\"weight\")\n\n        for i in range(n_layers):\n            dilation = dilation_rate**i\n            padding = int((kernel_size * dilation - dilation) / 2)\n            in_layer = torch.nn.Conv1d(\n                hidden_channels, 2 * hidden_channels, kernel_size, dilation=dilation, padding=padding\n            )\n            in_layer = torch.nn.utils.weight_norm(in_layer, name=\"weight\")\n            self.in_layers.append(in_layer)\n\n            # last one is not necessary\n            if i < n_layers - 1:\n                res_skip_channels = 2 * hidden_channels\n            else:\n                res_skip_channels = hidden_channels\n\n            res_skip_layer = torch.nn.Conv1d(hidden_channels, res_skip_channels, 1)\n            res_skip_layer = torch.nn.utils.weight_norm(res_skip_layer, name=\"weight\")\n            self.res_skip_layers.append(res_skip_layer)\n\n    def forward(self, x, x_mask, g=None, **kwargs):\n        output = torch.zeros_like(x)\n        n_channels_tensor = torch.IntTensor([self.hidden_channels])\n\n        if g is not None:\n            g = self.cond_layer(g)\n\n        for i in range(self.n_layers):\n            x_in = self.in_layers[i](x)\n            if g is not None:\n                cond_offset = i * 2 * self.hidden_channels\n                g_l = g[:, cond_offset : cond_offset + 2 * self.hidden_channels, :]\n            
else:\n                g_l = torch.zeros_like(x_in)\n\n            acts = commons.fused_add_tanh_sigmoid_multiply(x_in, g_l, n_channels_tensor)\n            acts = self.drop(acts)\n\n            res_skip_acts = self.res_skip_layers[i](acts)\n            if i < self.n_layers - 1:\n                res_acts = res_skip_acts[:, : self.hidden_channels, :]\n                x = (x + res_acts) * x_mask\n                output = output + res_skip_acts[:, self.hidden_channels :, :]\n            else:\n                output = output + res_skip_acts\n        return output * x_mask\n\n    def remove_weight_norm(self):\n        if self.gin_channels != 0:\n            torch.nn.utils.remove_weight_norm(self.cond_layer)\n        for l in self.in_layers:\n            torch.nn.utils.remove_weight_norm(l)\n        for l in self.res_skip_layers:\n            torch.nn.utils.remove_weight_norm(l)\n\n\nclass ResBlock1(torch.nn.Module):\n    def __init__(self, channels, kernel_size=3, dilation=(1, 3, 5)):\n        super(ResBlock1, self).__init__()\n        self.convs1 = nn.ModuleList(\n            [\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[0],\n                        padding=get_padding(kernel_size, dilation[0]),\n                    )\n                ),\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[1],\n                        padding=get_padding(kernel_size, dilation[1]),\n                    )\n                ),\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        
1,\n                        dilation=dilation[2],\n                        padding=get_padding(kernel_size, dilation[2]),\n                    )\n                ),\n            ]\n        )\n        self.convs1.apply(init_weights)\n\n        self.convs2 = nn.ModuleList(\n            [\n                weight_norm(\n                    Conv1d(channels, channels, kernel_size, 1, dilation=1, padding=get_padding(kernel_size, 1))\n                ),\n                weight_norm(\n                    Conv1d(channels, channels, kernel_size, 1, dilation=1, padding=get_padding(kernel_size, 1))\n                ),\n                weight_norm(\n                    Conv1d(channels, channels, kernel_size, 1, dilation=1, padding=get_padding(kernel_size, 1))\n                ),\n            ]\n        )\n        self.convs2.apply(init_weights)\n\n    def forward(self, x, x_mask=None):\n        for c1, c2 in zip(self.convs1, self.convs2):\n            xt = F.leaky_relu(x, LRELU_SLOPE)\n            if x_mask is not None:\n                xt = xt * x_mask\n            xt = c1(xt)\n            xt = F.leaky_relu(xt, LRELU_SLOPE)\n            if x_mask is not None:\n                xt = xt * x_mask\n            xt = c2(xt)\n            x = xt + x\n        if x_mask is not None:\n            x = x * x_mask\n        return x\n\n    def remove_weight_norm(self):\n        for l in self.convs1:\n            remove_weight_norm(l)\n        for l in self.convs2:\n            remove_weight_norm(l)\n\n\nclass ResBlock2(torch.nn.Module):\n    def __init__(self, channels, kernel_size=3, dilation=(1, 3)):\n        super(ResBlock2, self).__init__()\n        self.convs = nn.ModuleList(\n            [\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[0],\n                        padding=get_padding(kernel_size, 
dilation[0]),\n                    )\n                ),\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[1],\n                        padding=get_padding(kernel_size, dilation[1]),\n                    )\n                ),\n            ]\n        )\n        self.convs.apply(init_weights)\n\n    def forward(self, x, x_mask=None):\n        for c in self.convs:\n            xt = F.leaky_relu(x, LRELU_SLOPE)\n            if x_mask is not None:\n                xt = xt * x_mask\n            xt = c(xt)\n            x = xt + x\n        if x_mask is not None:\n            x = x * x_mask\n        return x\n\n    def remove_weight_norm(self):\n        for l in self.convs:\n            remove_weight_norm(l)\n\n\nclass Log(nn.Module):\n    def forward(self, x, x_mask, reverse=False, **kwargs):\n        if not reverse:\n            y = torch.log(torch.clamp_min(x, 1e-5)) * x_mask\n            logdet = torch.sum(-y, [1, 2])\n            return y, logdet\n        else:\n            x = torch.exp(x) * x_mask\n            return x\n\n\nclass Flip(nn.Module):\n    def forward(self, x, *args, reverse=False, **kwargs):\n        x = torch.flip(x, [1])\n        if not reverse:\n            logdet = torch.zeros(x.size(0)).to(dtype=x.dtype, device=x.device)\n            return x, logdet\n        else:\n            return x\n\n\nclass ElementwiseAffine(nn.Module):\n    def __init__(self, channels):\n        super().__init__()\n        self.channels = channels\n        self.m = nn.Parameter(torch.zeros(channels, 1))\n        self.logs = nn.Parameter(torch.zeros(channels, 1))\n\n    def forward(self, x, x_mask, reverse=False, **kwargs):\n        if not reverse:\n            y = self.m + torch.exp(self.logs) * x\n            y = y * x_mask\n            logdet = torch.sum(self.logs * x_mask, [1, 2])\n      
      return y, logdet\n        else:\n            x = (x - self.m) * torch.exp(-self.logs) * x_mask\n            return x\n\n\nclass ResidualCouplingLayer(nn.Module):\n    def __init__(\n        self,\n        channels,\n        hidden_channels,\n        kernel_size,\n        dilation_rate,\n        n_layers,\n        p_dropout=0,\n        gin_channels=0,\n        mean_only=False,\n    ):\n        assert channels % 2 == 0, \"channels should be divisible by 2\"\n        super().__init__()\n        self.channels = channels\n        self.hidden_channels = hidden_channels\n        self.kernel_size = kernel_size\n        self.dilation_rate = dilation_rate\n        self.n_layers = n_layers\n        self.half_channels = channels // 2\n        self.mean_only = mean_only\n\n        self.pre = nn.Conv1d(self.half_channels, hidden_channels, 1)\n        self.enc = WN(\n            hidden_channels, kernel_size, dilation_rate, n_layers, p_dropout=p_dropout, gin_channels=gin_channels\n        )\n        self.post = nn.Conv1d(hidden_channels, self.half_channels * (2 - mean_only), 1)\n        self.post.weight.data.zero_()\n        self.post.bias.data.zero_()\n\n    def forward(self, x, x_mask, g=None, reverse=False):\n        x0, x1 = torch.split(x, [self.half_channels] * 2, 1)\n        h = self.pre(x0) * x_mask\n        h = self.enc(h, x_mask, g=g)\n        stats = self.post(h) * x_mask\n        if not self.mean_only:\n            m, logs = torch.split(stats, [self.half_channels] * 2, 1)\n        else:\n            m = stats\n            logs = torch.zeros_like(m)\n\n        if not reverse:\n            x1 = m + x1 * torch.exp(logs) * x_mask\n            x = torch.cat([x0, x1], 1)\n            logdet = torch.sum(logs, [1, 2])\n            return x, logdet\n        else:\n            x1 = (x1 - m) * torch.exp(-logs) * x_mask\n            x = torch.cat([x0, x1], 1)\n            return x\n"
  },
  {
    "path": "TTS/vc/modules/freevc/speaker_encoder/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/vc/modules/freevc/speaker_encoder/audio.py",
    "content": "import struct\nfrom pathlib import Path\nfrom typing import Optional, Union\n\n# import webrtcvad\nimport librosa\nimport numpy as np\nfrom scipy.ndimage.morphology import binary_dilation\n\nfrom TTS.vc.modules.freevc.speaker_encoder.hparams import *\n\nint16_max = (2**15) - 1\n\n\ndef preprocess_wav(fpath_or_wav: Union[str, Path, np.ndarray], source_sr: Optional[int] = None):\n    \"\"\"\n    Applies the preprocessing operations used in training the Speaker Encoder to a waveform\n    either on disk or in memory. The waveform will be resampled to match the data hyperparameters.\n\n    :param fpath_or_wav: either a filepath to an audio file (many extensions are supported, not\n    just .wav), either the waveform as a numpy array of floats.\n    :param source_sr: if passing an audio waveform, the sampling rate of the waveform before\n    preprocessing. After preprocessing, the waveform's sampling rate will match the data\n    hyperparameters. If passing a filepath, the sampling rate will be automatically detected and\n    this argument will be ignored.\n    \"\"\"\n    # Load the wav from disk if needed\n    if isinstance(fpath_or_wav, str) or isinstance(fpath_or_wav, Path):\n        wav, source_sr = librosa.load(fpath_or_wav, sr=None)\n    else:\n        wav = fpath_or_wav\n\n    # Resample the wav if needed\n    if source_sr is not None and source_sr != sampling_rate:\n        wav = librosa.resample(wav, source_sr, sampling_rate)\n\n    # Apply the preprocessing: normalize volume and shorten long silences\n    wav = normalize_volume(wav, audio_norm_target_dBFS, increase_only=True)\n    wav = trim_long_silences(wav)\n\n    return wav\n\n\ndef wav_to_mel_spectrogram(wav):\n    \"\"\"\n    Derives a mel spectrogram ready to be used by the encoder from a preprocessed audio waveform.\n    Note: this not a log-mel spectrogram.\n    \"\"\"\n    frames = librosa.feature.melspectrogram(\n        y=wav,\n        sr=sampling_rate,\n        
n_fft=int(sampling_rate * mel_window_length / 1000),\n        hop_length=int(sampling_rate * mel_window_step / 1000),\n        n_mels=mel_n_channels,\n    )\n    return frames.astype(np.float32).T\n\n\ndef normalize_volume(wav, target_dBFS, increase_only=False, decrease_only=False):\n    if increase_only and decrease_only:\n        raise ValueError(\"Both increase only and decrease only are set\")\n    dBFS_change = target_dBFS - 10 * np.log10(np.mean(wav**2))\n    if (dBFS_change < 0 and increase_only) or (dBFS_change > 0 and decrease_only):\n        return wav\n    return wav * (10 ** (dBFS_change / 20))\n"
  },
  {
    "path": "TTS/vc/modules/freevc/speaker_encoder/hparams.py",
    "content": "## Mel-filterbank\nmel_window_length = 25  # In milliseconds\nmel_window_step = 10  # In milliseconds\nmel_n_channels = 40\n\n\n## Audio\nsampling_rate = 16000\n# Number of spectrogram frames in a partial utterance\npartials_n_frames = 160  # 1600 ms\n\n\n## Voice Activation Detection\n# Window size of the VAD. Must be either 10, 20 or 30 milliseconds.\n# This sets the granularity of the VAD. Should not need to be changed.\nvad_window_length = 30  # In milliseconds\n# Number of frames to average together when performing the moving average smoothing.\n# The larger this value, the larger the VAD variations must be to not get smoothed out.\nvad_moving_average_width = 8\n# Maximum number of consecutive silent frames a segment can have.\nvad_max_silence_length = 6\n\n\n## Audio volume normalization\naudio_norm_target_dBFS = -30\n\n\n## Model parameters\nmodel_hidden_size = 256\nmodel_embedding_size = 256\nmodel_num_layers = 3\n"
  },
  {
    "path": "TTS/vc/modules/freevc/speaker_encoder/speaker_encoder.py",
    "content": "from pathlib import Path\nfrom time import perf_counter as timer\nfrom typing import List, Union\n\nimport numpy as np\nimport torch\nfrom torch import nn\n\nfrom TTS.utils.io import load_fsspec\nfrom TTS.vc.modules.freevc.speaker_encoder import audio\nfrom TTS.vc.modules.freevc.speaker_encoder.hparams import *\n\n\nclass SpeakerEncoder(nn.Module):\n    def __init__(self, weights_fpath, device: Union[str, torch.device] = None, verbose=True):\n        \"\"\"\n        :param device: either a torch device or the name of a torch device (e.g. \"cpu\", \"cuda\").\n        If None, defaults to cuda if it is available on your machine, otherwise the model will\n        run on cpu. Outputs are always returned on the cpu, as numpy arrays.\n        \"\"\"\n        super().__init__()\n\n        # Define the network\n        self.lstm = nn.LSTM(mel_n_channels, model_hidden_size, model_num_layers, batch_first=True)\n        self.linear = nn.Linear(model_hidden_size, model_embedding_size)\n        self.relu = nn.ReLU()\n\n        # Get the target device\n        if device is None:\n            device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n        elif isinstance(device, str):\n            device = torch.device(device)\n        self.device = device\n\n        # Load the pretrained model'speaker weights\n        # weights_fpath = Path(__file__).resolve().parent.joinpath(\"pretrained.pt\")\n        # if not weights_fpath.exists():\n        #     raise Exception(\"Couldn't find the voice encoder pretrained model at %s.\" %\n        #                     weights_fpath)\n\n        start = timer()\n        checkpoint = load_fsspec(weights_fpath, map_location=\"cpu\")\n\n        self.load_state_dict(checkpoint[\"model_state\"], strict=False)\n        self.to(device)\n\n        if verbose:\n            print(\"Loaded the voice encoder model on %s in %.2f seconds.\" % (device.type, timer() - start))\n\n    def forward(self, mels: 
torch.FloatTensor):\n        \"\"\"\n        Computes the embeddings of a batch of utterance spectrograms.\n        :param mels: a batch of mel spectrograms of same duration as a float32 tensor of shape\n        (batch_size, n_frames, n_channels)\n        :return: the embeddings as a float32 tensor of shape (batch_size, embedding_size).\n        Embeddings are non-negative and L2-normed, thus they lie in the range [0, 1].\n        \"\"\"\n        # Pass the input through the LSTM layers and retrieve the final hidden state of the last\n        # layer. Apply a cutoff to 0 for negative values and L2 normalize the embeddings.\n        _, (hidden, _) = self.lstm(mels)\n        embeds_raw = self.relu(self.linear(hidden[-1]))\n        return embeds_raw / torch.norm(embeds_raw, dim=1, keepdim=True)\n\n    @staticmethod\n    def compute_partial_slices(n_samples: int, rate, min_coverage):\n        \"\"\"\n        Computes where to split an utterance waveform and its corresponding mel spectrogram to\n        obtain partial utterances of <partials_n_frames> each. Both the waveform and the\n        mel spectrogram slices are returned, so as to make each partial utterance waveform\n        correspond to its spectrogram.\n\n        The returned ranges may be indexing further than the length of the waveform. It is\n        recommended that you pad the waveform with zeros up to wav_slices[-1].stop.\n\n        :param n_samples: the number of samples in the waveform\n        :param rate: how many partial utterances should occur per second. Partial utterances must\n        cover the span of the entire utterance, thus the rate should not be lower than the inverse\n        of the duration of a partial utterance. By default, partial utterances are 1.6s long and\n        the minimum rate is thus 0.625.\n        :param min_coverage: when reaching the last partial utterance, it may or may not have\n        enough frames. 
If at least <min_coverage> of <partials_n_frames> are present,\n        then the last partial utterance will be considered by zero-padding the audio. Otherwise,\n        it will be discarded. If there aren't enough frames for one partial utterance,\n        this parameter is ignored so that the function always returns at least one slice.\n        :return: the waveform slices and mel spectrogram slices as lists of array slices. Index\n        respectively the waveform and the mel spectrogram with these slices to obtain the partial\n        utterances.\n        \"\"\"\n        assert 0 < min_coverage <= 1\n\n        # Compute how many frames separate two partial utterances\n        samples_per_frame = int((sampling_rate * mel_window_step / 1000))\n        n_frames = int(np.ceil((n_samples + 1) / samples_per_frame))\n        frame_step = int(np.round((sampling_rate / rate) / samples_per_frame))\n        assert 0 < frame_step, \"The rate is too high\"\n        assert frame_step <= partials_n_frames, \"The rate is too low, it should be %f at least\" % (\n            sampling_rate / (samples_per_frame * partials_n_frames)\n        )\n\n        # Compute the slices\n        wav_slices, mel_slices = [], []\n        steps = max(1, n_frames - partials_n_frames + frame_step + 1)\n        for i in range(0, steps, frame_step):\n            mel_range = np.array([i, i + partials_n_frames])\n            wav_range = mel_range * samples_per_frame\n            mel_slices.append(slice(*mel_range))\n            wav_slices.append(slice(*wav_range))\n\n        # Evaluate whether extra padding is warranted or not\n        last_wav_range = wav_slices[-1]\n        coverage = (n_samples - last_wav_range.start) / (last_wav_range.stop - last_wav_range.start)\n        if coverage < min_coverage and len(mel_slices) > 1:\n            mel_slices = mel_slices[:-1]\n            wav_slices = wav_slices[:-1]\n\n        return wav_slices, mel_slices\n\n    def embed_utterance(self, wav: np.ndarray, 
return_partials=False, rate=1.3, min_coverage=0.75):\n        \"\"\"\n        Computes an embedding for a single utterance. The utterance is divided into partial\n        utterances and an embedding is computed for each. The complete utterance embedding is the\n        L2-normed average embedding of the partial utterances.\n\n        TODO: independent batched version of this function\n\n        :param wav: a preprocessed utterance waveform as a numpy array of float32\n        :param return_partials: if True, the partial embeddings will also be returned along with\n        the wav slices corresponding to each partial utterance.\n        :param rate: how many partial utterances should occur per second. Partial utterances must\n        cover the span of the entire utterance, thus the rate should not be lower than the inverse\n        of the duration of a partial utterance. By default, partial utterances are 1.6s long and\n        the minimum rate is thus 0.625.\n        :param min_coverage: when reaching the last partial utterance, it may or may not have\n        enough frames. If at least <min_coverage> of <partials_n_frames> are present,\n        then the last partial utterance will be considered by zero-padding the audio. Otherwise,\n        it will be discarded. If there aren't enough frames for one partial utterance,\n        this parameter is ignored so that the function always returns at least one slice.\n        :return: the embedding as a numpy array of float32 of shape (model_embedding_size,). 
If\n        <return_partials> is True, the partial utterances as a numpy array of float32 of shape\n        (n_partials, model_embedding_size) and the wav partials as a list of slices will also be\n        returned.\n        \"\"\"\n        # Compute where to split the utterance into partials and pad the waveform with zeros if\n        # the partial utterances cover a larger range.\n        wav_slices, mel_slices = self.compute_partial_slices(len(wav), rate, min_coverage)\n        max_wave_length = wav_slices[-1].stop\n        if max_wave_length >= len(wav):\n            wav = np.pad(wav, (0, max_wave_length - len(wav)), \"constant\")\n\n        # Split the utterance into partials and forward them through the model\n        mel = audio.wav_to_mel_spectrogram(wav)\n        mels = np.array([mel[s] for s in mel_slices])\n        with torch.no_grad():\n            mels = torch.from_numpy(mels).to(self.device)\n            partial_embeds = self(mels).cpu().numpy()\n\n        # Compute the utterance embedding from the partial embeddings\n        raw_embed = np.mean(partial_embeds, axis=0)\n        embed = raw_embed / np.linalg.norm(raw_embed, 2)\n\n        if return_partials:\n            return embed, partial_embeds, wav_slices\n        return embed\n\n    def embed_speaker(self, wavs: List[np.ndarray], **kwargs):\n        \"\"\"\n        Compute the embedding of a collection of wavs (presumably from the same speaker) by\n        averaging their embeddings and L2-normalizing the result.\n\n        :param wavs: list of wavs as numpy arrays of float32.\n        :param kwargs: extra arguments to embed_utterance()\n        :return: the embedding as a numpy array of float32 of shape (model_embedding_size,).\n        \"\"\"\n        raw_embed = np.mean([self.embed_utterance(wav, return_partials=False, **kwargs) for wav in wavs], axis=0)\n        return raw_embed / np.linalg.norm(raw_embed, 2)\n"
  },
  {
    "path": "TTS/vc/modules/freevc/wavlm/__init__.py",
    "content": "import os\nimport urllib.request\n\nimport torch\n\nfrom TTS.utils.generic_utils import get_user_data_dir\nfrom TTS.vc.modules.freevc.wavlm.wavlm import WavLM, WavLMConfig\n\nmodel_uri = \"https://github.com/coqui-ai/TTS/releases/download/v0.13.0_models/WavLM-Large.pt\"\n\n\ndef get_wavlm(device=\"cpu\"):\n    \"\"\"Download the model and return the model object.\"\"\"\n\n    output_path = get_user_data_dir(\"tts\")\n\n    output_path = os.path.join(output_path, \"wavlm\")\n    if not os.path.exists(output_path):\n        os.makedirs(output_path)\n\n    output_path = os.path.join(output_path, \"WavLM-Large.pt\")\n    if not os.path.exists(output_path):\n        print(f\" > Downloading WavLM model to {output_path} ...\")\n        urllib.request.urlretrieve(model_uri, output_path)\n\n    checkpoint = torch.load(output_path, map_location=torch.device(device))\n    cfg = WavLMConfig(checkpoint[\"cfg\"])\n    wavlm = WavLM(cfg).to(device)\n    wavlm.load_state_dict(checkpoint[\"model\"])\n    wavlm.eval()\n    return wavlm\n\n\nif __name__ == \"__main__\":\n    wavlm = get_wavlm()\n"
  },
  {
    "path": "TTS/vc/modules/freevc/wavlm/config.json",
    "content": "{\n    \"_name_or_path\": \"./wavlm-large/\",\n    \"activation_dropout\": 0.0,\n    \"adapter_kernel_size\": 3,\n    \"adapter_stride\": 2,\n    \"add_adapter\": false,\n    \"apply_spec_augment\": true,\n    \"architectures\": [\n      \"WavLMModel\"\n    ],\n    \"attention_dropout\": 0.1,\n    \"bos_token_id\": 1,\n    \"classifier_proj_size\": 256,\n    \"codevector_dim\": 768,\n    \"contrastive_logits_temperature\": 0.1,\n    \"conv_bias\": false,\n    \"conv_dim\": [\n      512,\n      512,\n      512,\n      512,\n      512,\n      512,\n      512\n    ],\n    \"conv_kernel\": [\n      10,\n      3,\n      3,\n      3,\n      3,\n      2,\n      2\n    ],\n    \"conv_stride\": [\n      5,\n      2,\n      2,\n      2,\n      2,\n      2,\n      2\n    ],\n    \"ctc_loss_reduction\": \"sum\",\n    \"ctc_zero_infinity\": false,\n    \"diversity_loss_weight\": 0.1,\n    \"do_stable_layer_norm\": true,\n    \"eos_token_id\": 2,\n    \"feat_extract_activation\": \"gelu\",\n    \"feat_extract_dropout\": 0.0,\n    \"feat_extract_norm\": \"layer\",\n    \"feat_proj_dropout\": 0.1,\n    \"feat_quantizer_dropout\": 0.0,\n    \"final_dropout\": 0.0,\n    \"gradient_checkpointing\": false,\n    \"hidden_act\": \"gelu\",\n    \"hidden_dropout\": 0.1,\n    \"hidden_size\": 1024,\n    \"initializer_range\": 0.02,\n    \"intermediate_size\": 4096,\n    \"layer_norm_eps\": 1e-05,\n    \"layerdrop\": 0.1,\n    \"mask_channel_length\": 10,\n    \"mask_channel_min_space\": 1,\n    \"mask_channel_other\": 0.0,\n    \"mask_channel_prob\": 0.0,\n    \"mask_channel_selection\": \"static\",\n    \"mask_feature_length\": 10,\n    \"mask_feature_min_masks\": 0,\n    \"mask_feature_prob\": 0.0,\n    \"mask_time_length\": 10,\n    \"mask_time_min_masks\": 2,\n    \"mask_time_min_space\": 1,\n    \"mask_time_other\": 0.0,\n    \"mask_time_prob\": 0.075,\n    \"mask_time_selection\": \"static\",\n    \"max_bucket_distance\": 800,\n    \"model_type\": \"wavlm\",\n    
\"num_adapter_layers\": 3,\n    \"num_attention_heads\": 16,\n    \"num_buckets\": 320,\n    \"num_codevector_groups\": 2,\n    \"num_codevectors_per_group\": 320,\n    \"num_conv_pos_embedding_groups\": 16,\n    \"num_conv_pos_embeddings\": 128,\n    \"num_ctc_classes\": 80,\n    \"num_feat_extract_layers\": 7,\n    \"num_hidden_layers\": 24,\n    \"num_negatives\": 100,\n    \"output_hidden_size\": 1024,\n    \"pad_token_id\": 0,\n    \"proj_codevector_dim\": 768,\n    \"replace_prob\": 0.5,\n    \"tokenizer_class\": \"Wav2Vec2CTCTokenizer\",\n    \"torch_dtype\": \"float32\",\n    \"transformers_version\": \"4.15.0.dev0\",\n    \"use_weighted_layer_sum\": false,\n    \"vocab_size\": 32\n  }"
  },
  {
    "path": "TTS/vc/modules/freevc/wavlm/modules.py",
    "content": "# --------------------------------------------------------\n# WavLM: Large-Scale Self-Supervised  Pre-training  for Full Stack Speech Processing (https://arxiv.org/abs/2110.13900.pdf)\n# Github source: https://github.com/microsoft/unilm/tree/master/wavlm\n# Copyright (c) 2021 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Based on fairseq code bases\n# https://github.com/pytorch/fairseq\n# --------------------------------------------------------\n\nimport math\nimport warnings\nfrom typing import Dict, Optional, Tuple\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import Tensor, nn\nfrom torch.nn import Parameter\n\n\nclass TransposeLast(nn.Module):\n    def __init__(self, deconstruct_idx=None):\n        super().__init__()\n        self.deconstruct_idx = deconstruct_idx\n\n    def forward(self, x):\n        if self.deconstruct_idx is not None:\n            x = x[self.deconstruct_idx]\n        return x.transpose(-2, -1)\n\n\nclass Fp32LayerNorm(nn.LayerNorm):\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n\n    def forward(self, input):\n        output = F.layer_norm(\n            input.float(),\n            self.normalized_shape,\n            self.weight.float() if self.weight is not None else None,\n            self.bias.float() if self.bias is not None else None,\n            self.eps,\n        )\n        return output.type_as(input)\n\n\nclass Fp32GroupNorm(nn.GroupNorm):\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n\n    def forward(self, input):\n        output = F.group_norm(\n            input.float(),\n            self.num_groups,\n            self.weight.float() if self.weight is not None else None,\n            self.bias.float() if self.bias is not None else None,\n            self.eps,\n        )\n        return output.type_as(input)\n\n\nclass GradMultiply(torch.autograd.Function):\n    @staticmethod\n    def 
forward(ctx, x, scale):\n        ctx.scale = scale\n        res = x.new(x)\n        return res\n\n    @staticmethod\n    def backward(ctx, grad):\n        return grad * ctx.scale, None\n\n\nclass SamePad(nn.Module):\n    def __init__(self, kernel_size, causal=False):\n        super().__init__()\n        if causal:\n            self.remove = kernel_size - 1\n        else:\n            self.remove = 1 if kernel_size % 2 == 0 else 0\n\n    def forward(self, x):\n        if self.remove > 0:\n            x = x[:, :, : -self.remove]\n        return x\n\n\nclass Swish(nn.Module):\n    \"\"\"Swish function\"\"\"\n\n    def __init__(self):\n        \"\"\"Construct a Swish activation module.\"\"\"\n        super(Swish, self).__init__()\n        self.act = torch.nn.Sigmoid()\n\n    def forward(self, x):\n        return x * self.act(x)\n\n\nclass GLU_Linear(nn.Module):\n    def __init__(self, input_dim, output_dim, glu_type=\"sigmoid\", bias_in_glu=True):\n        super(GLU_Linear, self).__init__()\n\n        self.glu_type = glu_type\n        self.output_dim = output_dim\n\n        if glu_type == \"sigmoid\":\n            self.glu_act = torch.nn.Sigmoid()\n        elif glu_type == \"swish\":\n            self.glu_act = Swish()\n        elif glu_type == \"relu\":\n            self.glu_act = torch.nn.ReLU()\n        elif glu_type == \"gelu\":\n            self.glu_act = torch.nn.GELU()\n\n        if bias_in_glu:\n            self.linear = nn.Linear(input_dim, output_dim * 2, True)\n        else:\n            self.linear = nn.Linear(input_dim, output_dim * 2, False)\n\n    def forward(self, x):\n        # to be consistent with GLU_Linear, we assume the input always has the #channel (#dim) in the last dimension of the tensor, so need to switch the dimension first for 1D-Conv case\n        x = self.linear(x)\n\n        if self.glu_type == \"bilinear\":\n            x = x[:, :, 0 : self.output_dim] * x[:, :, self.output_dim : self.output_dim * 2]\n        else:\n            x = 
x[:, :, 0 : self.output_dim] * self.glu_act(x[:, :, self.output_dim : self.output_dim * 2])\n\n        return x\n\n\ndef gelu_accurate(x):\n    if not hasattr(gelu_accurate, \"_a\"):\n        gelu_accurate._a = math.sqrt(2 / math.pi)\n    return 0.5 * x * (1 + torch.tanh(gelu_accurate._a * (x + 0.044715 * torch.pow(x, 3))))\n\n\ndef gelu(x: torch.Tensor) -> torch.Tensor:\n    return torch.nn.functional.gelu(x.float()).type_as(x)\n\n\ndef get_activation_fn(activation: str):\n    \"\"\"Returns the activation function corresponding to `activation`\"\"\"\n\n    if activation == \"relu\":\n        return F.relu\n    elif activation == \"gelu\":\n        return gelu\n    elif activation == \"gelu_fast\":\n        warnings.warn(\"--activation-fn=gelu_fast has been renamed to gelu_accurate\")\n        return gelu_accurate\n    elif activation == \"gelu_accurate\":\n        return gelu_accurate\n    elif activation == \"tanh\":\n        return torch.tanh\n    elif activation == \"linear\":\n        return lambda x: x\n    elif activation == \"glu\":\n        return lambda x: x\n    else:\n        raise RuntimeError(\"--activation-fn {} not supported\".format(activation))\n\n\ndef init_bert_params(module):\n    \"\"\"\n    Initialize the weights specific to the BERT Model.\n    This overrides the default initializations depending on the specified arguments.\n        1. If normal_init_linear_weights is set then weights of linear\n           layer will be initialized using the normal distribution and\n           bias will be set to the specified value.\n        2. If normal_init_embed_weights is set then weights of embedding\n           layer will be initialized using the normal distribution.\n        3. 
If normal_init_proj_weights is set then weights of\n           in_project_weight for MultiHeadAttention initialized using\n           the normal distribution (to be validated).\n    \"\"\"\n\n    def normal_(data):\n        # with FSDP, module params will be on CUDA, so we cast them back to CPU\n        # so that the RNG is consistent with and without FSDP\n        data.copy_(data.cpu().normal_(mean=0.0, std=0.02).to(data.device))\n\n    if isinstance(module, nn.Linear):\n        normal_(module.weight.data)\n        if module.bias is not None:\n            module.bias.data.zero_()\n    if isinstance(module, nn.Embedding):\n        normal_(module.weight.data)\n        if module.padding_idx is not None:\n            module.weight.data[module.padding_idx].zero_()\n    if isinstance(module, MultiheadAttention):\n        normal_(module.q_proj.weight.data)\n        normal_(module.k_proj.weight.data)\n        normal_(module.v_proj.weight.data)\n\n\ndef quant_noise(module, p, block_size):\n    \"\"\"\n    Wraps modules and applies quantization noise to the weights for\n    subsequent quantization with Iterative Product Quantization as\n    described in \"Training with Quantization Noise for Extreme Model Compression\"\n\n    Args:\n        - module: nn.Module\n        - p: amount of Quantization Noise\n        - block_size: size of the blocks for subsequent quantization with iPQ\n\n    Remarks:\n        - Module weights must have the right sizes wrt the block size\n        - Only Linear, Embedding and Conv2d modules are supported for the moment\n        - For more detail on how to quantize by blocks with convolutional weights,\n          see \"And the Bit Goes Down: Revisiting the Quantization of Neural Networks\"\n        - We implement the simplest form of noise here as stated in the paper\n          which consists in randomly dropping blocks\n    \"\"\"\n\n    # if no quantization noise, don't register hook\n    if p <= 0:\n        return module\n\n    # supported 
modules\n    assert isinstance(module, (nn.Linear, nn.Embedding, nn.Conv2d))\n\n    # test whether module.weight has the right sizes wrt block_size\n    is_conv = module.weight.ndim == 4\n\n    # 2D matrix\n    if not is_conv:\n        assert module.weight.size(1) % block_size == 0, \"Input features must be a multiple of block sizes\"\n\n    # 4D matrix\n    else:\n        # 1x1 convolutions\n        if module.kernel_size == (1, 1):\n            assert module.in_channels % block_size == 0, \"Input channels must be a multiple of block sizes\"\n        # regular convolutions\n        else:\n            k = module.kernel_size[0] * module.kernel_size[1]\n            assert k % block_size == 0, \"Kernel size must be a multiple of block size\"\n\n    def _forward_pre_hook(mod, input):\n        # no noise for evaluation\n        if mod.training:\n            if not is_conv:\n                # gather weight and sizes\n                weight = mod.weight\n                in_features = weight.size(1)\n                out_features = weight.size(0)\n\n                # split weight matrix into blocks and randomly drop selected blocks\n                mask = torch.zeros(in_features // block_size * out_features, device=weight.device)\n                mask.bernoulli_(p)\n                mask = mask.repeat_interleave(block_size, -1).view(-1, in_features)\n\n            else:\n                # gather weight and sizes\n                weight = mod.weight\n                in_channels = mod.in_channels\n                out_channels = mod.out_channels\n\n                # split weight matrix into blocks and randomly drop selected blocks\n                if mod.kernel_size == (1, 1):\n                    mask = torch.zeros(\n                        int(in_channels // block_size * out_channels),\n                        device=weight.device,\n                    )\n                    mask.bernoulli_(p)\n                    mask = mask.repeat_interleave(block_size, -1).view(-1, 
in_channels)\n                else:\n                    mask = torch.zeros(weight.size(0), weight.size(1), device=weight.device)\n                    mask.bernoulli_(p)\n                    mask = mask.unsqueeze(2).unsqueeze(3).repeat(1, 1, mod.kernel_size[0], mod.kernel_size[1])\n\n            # scale weights and apply mask\n            mask = mask.to(torch.bool)  # x.bool() is not currently supported in TorchScript\n            s = 1 / (1 - p)\n            mod.weight.data = s * weight.masked_fill(mask, 0)\n\n    module.register_forward_pre_hook(_forward_pre_hook)\n    return module\n\n\nclass MultiheadAttention(nn.Module):\n    \"\"\"Multi-headed attention.\n\n    See \"Attention Is All You Need\" for more details.\n    \"\"\"\n\n    def __init__(\n        self,\n        embed_dim,\n        num_heads,\n        kdim=None,\n        vdim=None,\n        dropout=0.0,\n        bias=True,\n        add_bias_kv=False,\n        add_zero_attn=False,\n        self_attention=False,\n        encoder_decoder_attention=False,\n        q_noise=0.0,\n        qn_block_size=8,\n        has_relative_attention_bias=False,\n        num_buckets=32,\n        max_distance=128,\n        gru_rel_pos=False,\n        rescale_init=False,\n    ):\n        super().__init__()\n        self.embed_dim = embed_dim\n        self.kdim = kdim if kdim is not None else embed_dim\n        self.vdim = vdim if vdim is not None else embed_dim\n        self.qkv_same_dim = self.kdim == embed_dim and self.vdim == embed_dim\n\n        self.num_heads = num_heads\n        self.dropout_module = nn.Dropout(dropout)\n\n        self.has_relative_attention_bias = has_relative_attention_bias\n        self.num_buckets = num_buckets\n        self.max_distance = max_distance\n        if self.has_relative_attention_bias:\n            self.relative_attention_bias = nn.Embedding(num_buckets, num_heads)\n\n        self.head_dim = embed_dim // num_heads\n        self.q_head_dim = self.head_dim\n        self.k_head_dim = 
self.head_dim\n        assert self.head_dim * num_heads == self.embed_dim, \"embed_dim must be divisible by num_heads\"\n        self.scaling = self.head_dim**-0.5\n\n        self.self_attention = self_attention\n        self.encoder_decoder_attention = encoder_decoder_attention\n\n        assert not self.self_attention or self.qkv_same_dim, (\n            \"Self-attention requires query, key and \" \"value to be of the same size\"\n        )\n\n        k_bias = True\n        if rescale_init:\n            k_bias = False\n\n        k_embed_dim = embed_dim\n        q_embed_dim = embed_dim\n\n        self.k_proj = quant_noise(nn.Linear(self.kdim, k_embed_dim, bias=k_bias), q_noise, qn_block_size)\n        self.v_proj = quant_noise(nn.Linear(self.vdim, embed_dim, bias=bias), q_noise, qn_block_size)\n        self.q_proj = quant_noise(nn.Linear(embed_dim, q_embed_dim, bias=bias), q_noise, qn_block_size)\n\n        self.out_proj = quant_noise(nn.Linear(embed_dim, embed_dim, bias=bias), q_noise, qn_block_size)\n\n        if add_bias_kv:\n            self.bias_k = Parameter(torch.Tensor(1, 1, embed_dim))\n            self.bias_v = Parameter(torch.Tensor(1, 1, embed_dim))\n        else:\n            self.bias_k = self.bias_v = None\n\n        self.add_zero_attn = add_zero_attn\n\n        self.gru_rel_pos = gru_rel_pos\n        if self.gru_rel_pos:\n            self.grep_linear = nn.Linear(self.q_head_dim, 8)\n            self.grep_a = nn.Parameter(torch.ones(1, num_heads, 1, 1))\n\n        self.reset_parameters()\n\n    def reset_parameters(self):\n        if self.qkv_same_dim:\n            # Empirically observed the convergence to be much better with\n            # the scaled initialization\n            nn.init.xavier_uniform_(self.k_proj.weight, gain=1 / math.sqrt(2))\n            nn.init.xavier_uniform_(self.v_proj.weight, gain=1 / math.sqrt(2))\n            nn.init.xavier_uniform_(self.q_proj.weight, gain=1 / math.sqrt(2))\n        else:\n            
nn.init.xavier_uniform_(self.k_proj.weight)\n            nn.init.xavier_uniform_(self.v_proj.weight)\n            nn.init.xavier_uniform_(self.q_proj.weight)\n\n        nn.init.xavier_uniform_(self.out_proj.weight)\n        if self.out_proj.bias is not None:\n            nn.init.constant_(self.out_proj.bias, 0.0)\n        if self.bias_k is not None:\n            nn.init.xavier_normal_(self.bias_k)\n        if self.bias_v is not None:\n            nn.init.xavier_normal_(self.bias_v)\n        if self.has_relative_attention_bias:\n            nn.init.xavier_normal_(self.relative_attention_bias.weight)\n\n    def _relative_positions_bucket(self, relative_positions, bidirectional=True):\n        num_buckets = self.num_buckets\n        max_distance = self.max_distance\n        relative_buckets = 0\n\n        if bidirectional:\n            num_buckets = num_buckets // 2\n            relative_buckets += (relative_positions > 0).to(torch.long) * num_buckets\n            relative_positions = torch.abs(relative_positions)\n        else:\n            relative_positions = -torch.min(relative_positions, torch.zeros_like(relative_positions))\n\n        max_exact = num_buckets // 2\n        is_small = relative_positions < max_exact\n\n        relative_postion_if_large = max_exact + (\n            torch.log(relative_positions.float() / max_exact)\n            / math.log(max_distance / max_exact)\n            * (num_buckets - max_exact)\n        ).to(torch.long)\n        relative_postion_if_large = torch.min(\n            relative_postion_if_large, torch.full_like(relative_postion_if_large, num_buckets - 1)\n        )\n\n        relative_buckets += torch.where(is_small, relative_positions, relative_postion_if_large)\n        return relative_buckets\n\n    def compute_bias(self, query_length, key_length):\n        context_position = torch.arange(query_length, dtype=torch.long)[:, None]\n        memory_position = torch.arange(key_length, dtype=torch.long)[None, :]\n        
relative_position = memory_position - context_position\n        relative_position_bucket = self._relative_positions_bucket(relative_position, bidirectional=True)\n        relative_position_bucket = relative_position_bucket.to(self.relative_attention_bias.weight.device)\n        values = self.relative_attention_bias(relative_position_bucket)\n        values = values.permute([2, 0, 1])\n        return values\n\n    def forward(\n        self,\n        query,\n        key: Optional[Tensor],\n        value: Optional[Tensor],\n        key_padding_mask: Optional[Tensor] = None,\n        incremental_state: Optional[Dict[str, Dict[str, Optional[Tensor]]]] = None,\n        need_weights: bool = True,\n        static_kv: bool = False,\n        attn_mask: Optional[Tensor] = None,\n        before_softmax: bool = False,\n        need_head_weights: bool = False,\n        position_bias: Optional[Tensor] = None,\n    ) -> Tuple[Tensor, Optional[Tensor], Optional[Tensor]]:\n        \"\"\"Input shape: Time x Batch x Channel\n\n        Args:\n            key_padding_mask (ByteTensor, optional): mask to exclude\n                keys that are pads, of shape `(batch, src_len)`, where\n                padding elements are indicated by 1s.\n            need_weights (bool, optional): return the attention weights,\n                averaged over heads (default: True).\n            attn_mask (ByteTensor, optional): typically used to\n                implement causal attention, where the mask prevents the\n                attention from looking forward in time (default: None).\n            before_softmax (bool, optional): return the raw attention\n                weights and values before the attention softmax.\n            need_head_weights (bool, optional): return the attention\n                weights for each head. Implies *need_weights*. 
Default:\n                return the average attention weights over all heads.\n        \"\"\"\n        if need_head_weights:\n            need_weights = True\n\n        is_tpu = query.device.type == \"xla\"\n\n        tgt_len, bsz, embed_dim = query.size()\n        src_len = tgt_len\n        assert embed_dim == self.embed_dim\n        assert list(query.size()) == [tgt_len, bsz, embed_dim]\n        if key is not None:\n            src_len, key_bsz, _ = key.size()\n            if not torch.jit.is_scripting():\n                assert key_bsz == bsz\n                assert value is not None\n                # Compare as a tuple; `assert src_len, bsz == ...` would only check that src_len is truthy\n                assert (src_len, bsz) == tuple(value.shape[:2])\n\n        if self.has_relative_attention_bias and position_bias is None:\n            position_bias = self.compute_bias(tgt_len, src_len)\n            position_bias = position_bias.unsqueeze(0).repeat(bsz, 1, 1, 1).view(bsz * self.num_heads, tgt_len, src_len)\n\n        if (\n            not is_tpu  # don't use PyTorch version on TPUs\n            and incremental_state is None\n            and not static_kv\n            # A workaround for quantization to work. 
Otherwise JIT compilation\n            # treats bias in linear module as method.\n            and not torch.jit.is_scripting()\n            and self.q_head_dim == self.head_dim\n        ):\n            assert key is not None and value is not None\n            assert attn_mask is None\n\n            attn_mask_rel_pos = None\n            if position_bias is not None:\n                attn_mask_rel_pos = position_bias\n                if self.gru_rel_pos:\n                    query_layer = query.transpose(0, 1)\n                    new_x_shape = query_layer.size()[:-1] + (self.num_heads, -1)\n                    query_layer = query_layer.view(*new_x_shape)\n                    query_layer = query_layer.permute(0, 2, 1, 3)\n                    _B, _H, _L, __ = query_layer.size()\n\n                    gate_a, gate_b = torch.sigmoid(\n                        self.grep_linear(query_layer).view(_B, _H, _L, 2, 4).sum(-1, keepdim=False)\n                    ).chunk(2, dim=-1)\n                    gate_a_1 = gate_a * (gate_b * self.grep_a - 1.0) + 2.0\n                    attn_mask_rel_pos = gate_a_1.view(bsz * self.num_heads, -1, 1) * position_bias\n\n                attn_mask_rel_pos = attn_mask_rel_pos.view((-1, tgt_len, tgt_len))\n            # k_proj has no bias when rescale_init is set; substitute zeros so the biases can be concatenated\n            k_proj_bias = self.k_proj.bias\n            if k_proj_bias is None:\n                k_proj_bias = torch.zeros_like(self.q_proj.bias)\n\n            x, attn = F.multi_head_attention_forward(\n                query,\n                key,\n                value,\n                self.embed_dim,\n                self.num_heads,\n                torch.empty([0]),\n                torch.cat((self.q_proj.bias, k_proj_bias, self.v_proj.bias)),\n                self.bias_k,\n                self.bias_v,\n                self.add_zero_attn,\n                self.dropout_module.p,\n                self.out_proj.weight,\n                self.out_proj.bias,\n                self.training,\n                # self.training or 
self.dropout_module.apply_during_inference,\n                key_padding_mask,\n                need_weights,\n                attn_mask_rel_pos,\n                use_separate_proj_weight=True,\n                q_proj_weight=self.q_proj.weight,\n                k_proj_weight=self.k_proj.weight,\n                v_proj_weight=self.v_proj.weight,\n            )\n            return x, attn, position_bias\n\n        if incremental_state is not None:\n            saved_state = self._get_input_buffer(incremental_state)\n            if saved_state is not None and \"prev_key\" in saved_state:\n                # previous time steps are cached - no need to recompute\n                # key and value if they are static\n                if static_kv:\n                    assert self.encoder_decoder_attention and not self.self_attention\n                    key = value = None\n        else:\n            saved_state = None\n\n        if self.self_attention:\n            q = self.q_proj(query)\n            k = self.k_proj(query)\n            v = self.v_proj(query)\n        elif self.encoder_decoder_attention:\n            # encoder-decoder attention\n            q = self.q_proj(query)\n            if key is None:\n                assert value is None\n                k = v = None\n            else:\n                k = self.k_proj(key)\n                v = self.v_proj(key)\n\n        else:\n            assert key is not None and value is not None\n            q = self.q_proj(query)\n            k = self.k_proj(key)\n            v = self.v_proj(value)\n        q *= self.scaling\n\n        if self.bias_k is not None:\n            assert self.bias_v is not None\n            k = torch.cat([k, self.bias_k.repeat(1, bsz, 1)])\n            v = torch.cat([v, self.bias_v.repeat(1, bsz, 1)])\n            if attn_mask is not None:\n                attn_mask = torch.cat([attn_mask, attn_mask.new_zeros(attn_mask.size(0), 1)], dim=1)\n            if key_padding_mask is not None:\n               
 key_padding_mask = torch.cat(\n                    [\n                        key_padding_mask,\n                        key_padding_mask.new_zeros(key_padding_mask.size(0), 1),\n                    ],\n                    dim=1,\n                )\n\n        q = q.contiguous().view(tgt_len, bsz * self.num_heads, self.q_head_dim).transpose(0, 1)\n        if k is not None:\n            k = k.contiguous().view(-1, bsz * self.num_heads, self.k_head_dim).transpose(0, 1)\n        if v is not None:\n            v = v.contiguous().view(-1, bsz * self.num_heads, self.head_dim).transpose(0, 1)\n\n        if saved_state is not None:\n            # saved states are stored with shape (bsz, num_heads, seq_len, head_dim)\n            if \"prev_key\" in saved_state:\n                _prev_key = saved_state[\"prev_key\"]\n                assert _prev_key is not None\n                prev_key = _prev_key.view(bsz * self.num_heads, -1, self.head_dim)\n                if static_kv:\n                    k = prev_key\n                else:\n                    assert k is not None\n                    k = torch.cat([prev_key, k], dim=1)\n                src_len = k.size(1)\n            if \"prev_value\" in saved_state:\n                _prev_value = saved_state[\"prev_value\"]\n                assert _prev_value is not None\n                prev_value = _prev_value.view(bsz * self.num_heads, -1, self.head_dim)\n                if static_kv:\n                    v = prev_value\n                else:\n                    assert v is not None\n                    v = torch.cat([prev_value, v], dim=1)\n            prev_key_padding_mask: Optional[Tensor] = None\n            if \"prev_key_padding_mask\" in saved_state:\n                prev_key_padding_mask = saved_state[\"prev_key_padding_mask\"]\n            assert k is not None and v is not None\n            key_padding_mask = MultiheadAttention._append_prev_key_padding_mask(\n                key_padding_mask=key_padding_mask,\n          
      prev_key_padding_mask=prev_key_padding_mask,\n                batch_size=bsz,\n                src_len=k.size(1),\n                static_kv=static_kv,\n            )\n\n            saved_state[\"prev_key\"] = k.view(bsz, self.num_heads, -1, self.head_dim)\n            saved_state[\"prev_value\"] = v.view(bsz, self.num_heads, -1, self.head_dim)\n            saved_state[\"prev_key_padding_mask\"] = key_padding_mask\n            # In this branch incremental_state is never None\n            assert incremental_state is not None\n            incremental_state = self._set_input_buffer(incremental_state, saved_state)\n        assert k is not None\n        assert k.size(1) == src_len\n\n        # This is part of a workaround to get around fork/join parallelism\n        # not supporting Optional types.\n        if key_padding_mask is not None and key_padding_mask.dim() == 0:\n            key_padding_mask = None\n\n        if key_padding_mask is not None:\n            assert key_padding_mask.size(0) == bsz\n            assert key_padding_mask.size(1) == src_len\n\n        if self.add_zero_attn:\n            assert v is not None\n            src_len += 1\n            k = torch.cat([k, k.new_zeros((k.size(0), 1) + k.size()[2:])], dim=1)\n            v = torch.cat([v, v.new_zeros((v.size(0), 1) + v.size()[2:])], dim=1)\n            if attn_mask is not None:\n                attn_mask = torch.cat([attn_mask, attn_mask.new_zeros(attn_mask.size(0), 1)], dim=1)\n            if key_padding_mask is not None:\n                key_padding_mask = torch.cat(\n                    [\n                        key_padding_mask,\n                        torch.zeros(key_padding_mask.size(0), 1).type_as(key_padding_mask),\n                    ],\n                    dim=1,\n                )\n\n        attn_weights = torch.bmm(q, k.transpose(1, 2))\n        attn_weights = self.apply_sparse_mask(attn_weights, tgt_len, src_len, bsz)\n\n        assert list(attn_weights.size()) == [bsz * 
self.num_heads, tgt_len, src_len]\n\n        if attn_mask is not None:\n            attn_mask = attn_mask.unsqueeze(0)\n            attn_weights += attn_mask\n\n        if key_padding_mask is not None:\n            # don't attend to padding symbols\n            attn_weights = attn_weights.view(bsz, self.num_heads, tgt_len, src_len)\n            if not is_tpu:\n                attn_weights = attn_weights.masked_fill(\n                    key_padding_mask.unsqueeze(1).unsqueeze(2).to(torch.bool),\n                    float(\"-inf\"),\n                )\n            else:\n                attn_weights = attn_weights.transpose(0, 2)\n                attn_weights = attn_weights.masked_fill(key_padding_mask, float(\"-inf\"))\n                attn_weights = attn_weights.transpose(0, 2)\n            attn_weights = attn_weights.view(bsz * self.num_heads, tgt_len, src_len)\n\n        if before_softmax:\n            return attn_weights, v, position_bias\n\n        if position_bias is not None:\n            if self.gru_rel_pos == 1:\n                query_layer = q.view(bsz, self.num_heads, tgt_len, self.q_head_dim)\n                _B, _H, _L, __ = query_layer.size()\n                gate_a, gate_b = torch.sigmoid(\n                    self.grep_linear(query_layer).view(_B, _H, _L, 2, 4).sum(-1, keepdim=False)\n                ).chunk(2, dim=-1)\n                gate_a_1 = gate_a * (gate_b * self.grep_a - 1.0) + 2.0\n                position_bias = gate_a_1.view(bsz * self.num_heads, -1, 1) * position_bias\n\n            position_bias = position_bias.view(attn_weights.size())\n\n            attn_weights = attn_weights + position_bias\n\n        attn_weights_float = F.softmax(attn_weights, dim=-1)\n        attn_weights = attn_weights_float.type_as(attn_weights)\n        attn_probs = self.dropout_module(attn_weights)\n\n        assert v is not None\n        attn = torch.bmm(attn_probs, v)\n        assert list(attn.size()) == [bsz * self.num_heads, tgt_len, self.head_dim]\n      
  attn = attn.transpose(0, 1).contiguous().view(tgt_len, bsz, embed_dim)\n        attn = self.out_proj(attn)\n        attn_weights: Optional[Tensor] = None\n        if need_weights:\n            attn_weights = attn_weights_float.view(bsz, self.num_heads, tgt_len, src_len).transpose(1, 0)\n            if not need_head_weights:\n                # average attention weights over heads\n                attn_weights = attn_weights.mean(dim=0)\n\n        return attn, attn_weights, position_bias\n\n    @staticmethod\n    def _append_prev_key_padding_mask(\n        key_padding_mask: Optional[Tensor],\n        prev_key_padding_mask: Optional[Tensor],\n        batch_size: int,\n        src_len: int,\n        static_kv: bool,\n    ) -> Optional[Tensor]:\n        # saved key padding masks have shape (bsz, seq_len)\n        if prev_key_padding_mask is not None and static_kv:\n            new_key_padding_mask = prev_key_padding_mask\n        elif prev_key_padding_mask is not None and key_padding_mask is not None:\n            new_key_padding_mask = torch.cat([prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)\n        # During incremental decoding, as the padding token enters and\n        # leaves the frame, there will be a time when prev or current\n        # is None\n        elif prev_key_padding_mask is not None:\n            if src_len > prev_key_padding_mask.size(1):\n                filler = torch.zeros(\n                    (batch_size, src_len - prev_key_padding_mask.size(1)),\n                    device=prev_key_padding_mask.device,\n                )\n                new_key_padding_mask = torch.cat([prev_key_padding_mask.float(), filler.float()], dim=1)\n            else:\n                new_key_padding_mask = prev_key_padding_mask.float()\n        elif key_padding_mask is not None:\n            if src_len > key_padding_mask.size(1):\n                filler = torch.zeros(\n                    (batch_size, src_len - key_padding_mask.size(1)),\n            
        device=key_padding_mask.device,\n                )\n                new_key_padding_mask = torch.cat([filler.float(), key_padding_mask.float()], dim=1)\n            else:\n                new_key_padding_mask = key_padding_mask.float()\n        else:\n            new_key_padding_mask = prev_key_padding_mask\n        return new_key_padding_mask\n\n    def _get_input_buffer(\n        self, incremental_state: Optional[Dict[str, Dict[str, Optional[Tensor]]]]\n    ) -> Dict[str, Optional[Tensor]]:\n        result = self.get_incremental_state(incremental_state, \"attn_state\")\n        if result is not None:\n            return result\n        else:\n            empty_result: Dict[str, Optional[Tensor]] = {}\n            return empty_result\n\n    def _set_input_buffer(\n        self,\n        incremental_state: Dict[str, Dict[str, Optional[Tensor]]],\n        buffer: Dict[str, Optional[Tensor]],\n    ):\n        return self.set_incremental_state(incremental_state, \"attn_state\", buffer)\n\n    def apply_sparse_mask(self, attn_weights, tgt_len: int, src_len: int, bsz: int):\n        return attn_weights\n"
  },
  {
    "path": "TTS/vc/modules/freevc/wavlm/wavlm.py",
    "content": "# --------------------------------------------------------\n# WavLM: Large-Scale Self-Supervised  Pre-training  for Full Stack Speech Processing (https://arxiv.org/abs/2110.13900.pdf)\n# Github source: https://github.com/microsoft/unilm/tree/master/wavlm\n# Copyright (c) 2021 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Based on fairseq code bases\n# https://github.com/pytorch/fairseq\n# --------------------------------------------------------\n\nimport logging\nimport math\nfrom typing import List, Optional, Tuple\n\nimport numpy as np\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom torch.nn import LayerNorm\n\nfrom TTS.vc.modules.freevc.wavlm.modules import (\n    Fp32GroupNorm,\n    Fp32LayerNorm,\n    GLU_Linear,\n    GradMultiply,\n    MultiheadAttention,\n    SamePad,\n    TransposeLast,\n    get_activation_fn,\n    init_bert_params,\n)\n\nlogger = logging.getLogger(__name__)\n\n\ndef compute_mask_indices(\n    shape: Tuple[int, int],\n    padding_mask: Optional[torch.Tensor],\n    mask_prob: float,\n    mask_length: int,\n    mask_type: str = \"static\",\n    mask_other: float = 0.0,\n    min_masks: int = 0,\n    no_overlap: bool = False,\n    min_space: int = 0,\n) -> np.ndarray:\n    \"\"\"\n    Computes random mask spans for a given shape\n\n    Args:\n        shape: the shape for which to compute masks.\n            should be of size 2 where first element is batch size and 2nd is timesteps\n        padding_mask: optional padding mask of the same size as shape, which will prevent masking padded elements\n        mask_prob: probability for each token to be chosen as start of the span to be masked. 
this will be multiplied by\n            number of timesteps divided by length of mask span to mask approximately this percentage of all elements.\n            however due to overlaps, the actual number will be smaller (unless no_overlap is True)\n        mask_type: how to compute mask lengths\n            static = fixed size\n            uniform = sample from uniform distribution [mask_other, mask_length*2]\n            normal = sample from normal distribution with mean mask_length and stdev mask_other. mask is min 1 element\n            poisson = sample from Poisson distribution with lambda = mask_length\n        min_masks: minimum number of masked spans\n        no_overlap: if true, will switch to an alternative recursive algorithm that prevents spans from overlapping\n        min_space: only used if no_overlap is True, this is how many elements to keep unmasked between spans\n    \"\"\"\n\n    bsz, all_sz = shape\n    mask = np.full((bsz, all_sz), False)\n\n    all_num_mask = int(\n        # add a random number for probabilistic rounding\n        mask_prob * all_sz / float(mask_length)\n        + np.random.rand()\n    )\n\n    all_num_mask = max(min_masks, all_num_mask)\n\n    mask_idcs = []\n    for i in range(bsz):\n        if padding_mask is not None:\n            sz = all_sz - padding_mask[i].long().sum().item()\n            num_mask = int(\n                # add a random number for probabilistic rounding\n                mask_prob * sz / float(mask_length)\n                + np.random.rand()\n            )\n            num_mask = max(min_masks, num_mask)\n        else:\n            sz = all_sz\n            num_mask = all_num_mask\n\n        if mask_type == \"static\":\n            lengths = np.full(num_mask, mask_length)\n        elif mask_type == \"uniform\":\n            lengths = np.random.randint(mask_other, mask_length * 2 + 1, size=num_mask)\n        elif mask_type == \"normal\":\n            lengths = np.random.normal(mask_length, mask_other, 
size=num_mask)\n            lengths = [max(1, int(round(x))) for x in lengths]\n        elif mask_type == \"poisson\":\n            lengths = np.random.poisson(mask_length, size=num_mask)\n            lengths = [int(round(x)) for x in lengths]\n        else:\n            raise Exception(\"unknown mask selection \" + mask_type)\n\n        if sum(lengths) == 0:\n            lengths[0] = min(mask_length, sz - 1)\n\n        if no_overlap:\n            mask_idc = []\n\n            def arrange(s, e, length, keep_length):\n                span_start = np.random.randint(s, e - length)\n                mask_idc.extend(span_start + i for i in range(length))\n\n                new_parts = []\n                if span_start - s - min_space >= keep_length:\n                    new_parts.append((s, span_start - min_space + 1))\n                if e - span_start - keep_length - min_space > keep_length:\n                    new_parts.append((span_start + length + min_space, e))\n                return new_parts\n\n            parts = [(0, sz)]\n            min_length = min(lengths)\n            for length in sorted(lengths, reverse=True):\n                lens = np.fromiter(\n                    (e - s if e - s >= length + min_space else 0 for s, e in parts),\n                    int,\n                )\n                l_sum = np.sum(lens)\n                if l_sum == 0:\n                    break\n                probs = lens / np.sum(lens)\n                c = np.random.choice(len(parts), p=probs)\n                s, e = parts.pop(c)\n                parts.extend(arrange(s, e, length, min_length))\n            mask_idc = np.asarray(mask_idc)\n        else:\n            min_len = min(lengths)\n            if sz - min_len <= num_mask:\n                min_len = sz - num_mask - 1\n\n            mask_idc = np.random.choice(sz - min_len, num_mask, replace=False)\n\n            mask_idc = np.asarray([mask_idc[j] + offset for j in range(len(mask_idc)) for offset in 
range(lengths[j])])\n\n        mask_idcs.append(np.unique(mask_idc[mask_idc < sz]))\n\n    min_len = min([len(m) for m in mask_idcs])\n    for i, mask_idc in enumerate(mask_idcs):\n        if len(mask_idc) > min_len:\n            mask_idc = np.random.choice(mask_idc, min_len, replace=False)\n        mask[i, mask_idc] = True\n\n    return mask\n\n\nclass WavLMConfig:\n    def __init__(self, cfg=None):\n        self.extractor_mode: str = \"default\"  # mode for feature extractor. default has a single group norm with d groups in the first conv block, whereas layer_norm has layer norms in every block (meant to use with normalize=True)\n        self.encoder_layers: int = 12  # num encoder layers in the transformer\n\n        self.encoder_embed_dim: int = 768  # encoder embedding dimension\n        self.encoder_ffn_embed_dim: int = 3072  # encoder embedding dimension for FFN\n        self.encoder_attention_heads: int = 12  # num encoder attention heads\n        self.activation_fn: str = \"gelu\"  # activation function to use\n\n        self.layer_norm_first: bool = False  # apply layernorm first in the transformer\n        self.conv_feature_layers: str = \"[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2\"  # string describing convolutional feature extraction layers in form of a python list that contains [(dim, kernel_size, stride), ...]\n        self.conv_bias: bool = False  # include bias in conv encoder\n        self.feature_grad_mult: float = 1.0  # multiply feature extractor var grads by this\n\n        self.normalize: bool = False  # normalize input to have 0 mean and unit variance during training\n\n        # dropouts\n        self.dropout: float = 0.1  # dropout probability for the transformer\n        self.attention_dropout: float = 0.1  # dropout probability for attention weights\n        self.activation_dropout: float = 0.0  # dropout probability after activation in FFN\n        self.encoder_layerdrop: float = 0.0  # probability of dropping a transformer 
layer\n        self.dropout_input: float = 0.0  # dropout to apply to the input (after feat extr)\n        self.dropout_features: float = 0.0  # dropout to apply to the features (after feat extr)\n\n        # masking\n        self.mask_length: int = 10  # mask length\n        self.mask_prob: float = 0.65  # probability of replacing a token with mask\n        self.mask_selection: str = \"static\"  # how to choose mask length\n        self.mask_other: float = (\n            0  # secondary mask argument (used for more complex distributions), see help in compute_mask_indices\n        )\n        self.no_mask_overlap: bool = False  # if True, prevent masks from overlapping\n        self.mask_min_space: int = 1  # min space between spans (if no overlap is enabled)\n\n        # channel masking\n        self.mask_channel_length: int = 10  # length of the mask for features (channels)\n        self.mask_channel_prob: float = 0.0  # probability of replacing a feature with 0\n        self.mask_channel_selection: str = \"static\"  # how to choose mask length for channel masking\n        self.mask_channel_other: float = (\n            0  # secondary mask argument (used for more complex distributions), see help in compute_mask_indices\n        )\n        self.no_mask_channel_overlap: bool = False  # if True, prevent channel masks from overlapping\n        self.mask_channel_min_space: int = 1  # min space between spans (if no overlap is enabled)\n\n        # positional embeddings\n        self.conv_pos: int = 128  # number of filters for convolutional positional embeddings\n        self.conv_pos_groups: int = 16  # number of groups for convolutional positional embedding\n\n        # relative position embedding\n        self.relative_position_embedding: bool = False  # apply relative position embedding\n        self.num_buckets: int = 320  # number of buckets for relative position embedding\n        self.max_distance: int = 1280  # maximum distance for relative position embedding\n        
self.gru_rel_pos: bool = False  # apply gated relative position embedding\n\n        if cfg is not None:\n            self.update(cfg)\n\n    def update(self, cfg: dict):\n        self.__dict__.update(cfg)\n\n\nclass WavLM(nn.Module):\n    def __init__(\n        self,\n        cfg: WavLMConfig,\n    ) -> None:\n        super().__init__()\n        logger.info(f\"WavLM Config: {cfg.__dict__}\")\n\n        self.cfg = cfg\n        feature_enc_layers = eval(cfg.conv_feature_layers)\n        self.embed = feature_enc_layers[-1][0]\n\n        self.feature_extractor = ConvFeatureExtractionModel(\n            conv_layers=feature_enc_layers,\n            dropout=0.0,\n            mode=cfg.extractor_mode,\n            conv_bias=cfg.conv_bias,\n        )\n\n        self.post_extract_proj = (\n            nn.Linear(self.embed, cfg.encoder_embed_dim) if self.embed != cfg.encoder_embed_dim else None\n        )\n\n        self.mask_prob = cfg.mask_prob\n        self.mask_selection = cfg.mask_selection\n        self.mask_other = cfg.mask_other\n        self.mask_length = cfg.mask_length\n        self.no_mask_overlap = cfg.no_mask_overlap\n        self.mask_min_space = cfg.mask_min_space\n\n        self.mask_channel_prob = cfg.mask_channel_prob\n        self.mask_channel_selection = cfg.mask_channel_selection\n        self.mask_channel_other = cfg.mask_channel_other\n        self.mask_channel_length = cfg.mask_channel_length\n        self.no_mask_channel_overlap = cfg.no_mask_channel_overlap\n        self.mask_channel_min_space = cfg.mask_channel_min_space\n\n        self.dropout_input = nn.Dropout(cfg.dropout_input)\n        self.dropout_features = nn.Dropout(cfg.dropout_features)\n\n        self.feature_grad_mult = cfg.feature_grad_mult\n\n        self.mask_emb = nn.Parameter(torch.FloatTensor(cfg.encoder_embed_dim).uniform_())\n\n        self.encoder = TransformerEncoder(cfg)\n        self.layer_norm = LayerNorm(self.embed)\n\n    def apply_mask(self, x, padding_mask):\n        B, 
T, C = x.shape\n        if self.mask_prob > 0:\n            mask_indices = compute_mask_indices(\n                (B, T),\n                padding_mask,\n                self.mask_prob,\n                self.mask_length,\n                self.mask_selection,\n                self.mask_other,\n                min_masks=2,\n                no_overlap=self.no_mask_overlap,\n                min_space=self.mask_min_space,\n            )\n            mask_indices = torch.from_numpy(mask_indices).to(x.device)\n            x[mask_indices] = self.mask_emb\n        else:\n            mask_indices = None\n\n        if self.mask_channel_prob > 0:\n            mask_channel_indices = compute_mask_indices(\n                (B, C),\n                None,\n                self.mask_channel_prob,\n                self.mask_channel_length,\n                self.mask_channel_selection,\n                self.mask_channel_other,\n                no_overlap=self.no_mask_channel_overlap,\n                min_space=self.mask_channel_min_space,\n            )\n            mask_channel_indices = torch.from_numpy(mask_channel_indices).to(x.device).unsqueeze(1).expand(-1, T, -1)\n            x[mask_channel_indices] = 0\n\n        return x, mask_indices\n\n    def forward_padding_mask(\n        self,\n        features: torch.Tensor,\n        padding_mask: torch.Tensor,\n    ) -> torch.Tensor:\n        extra = padding_mask.size(1) % features.size(1)\n        if extra > 0:\n            padding_mask = padding_mask[:, :-extra]\n        padding_mask = padding_mask.view(padding_mask.size(0), features.size(1), -1)\n        # padding_mask = padding_mask.all(-1)\n        padding_mask = padding_mask.any(-1)\n        return padding_mask\n\n    def extract_features(\n        self,\n        source: torch.Tensor,\n        padding_mask: Optional[torch.Tensor] = None,\n        mask: bool = False,\n        ret_conv: bool = False,\n        output_layer: Optional[int] = None,\n        ret_layer_results: bool = 
False,\n    ):\n        if self.feature_grad_mult > 0:\n            features = self.feature_extractor(source)\n            if self.feature_grad_mult != 1.0:\n                features = GradMultiply.apply(features, self.feature_grad_mult)\n        else:\n            with torch.no_grad():\n                features = self.feature_extractor(source)\n\n        features = features.transpose(1, 2)\n        features = self.layer_norm(features)\n\n        if padding_mask is not None:\n            padding_mask = self.forward_padding_mask(features, padding_mask)\n\n        if self.post_extract_proj is not None:\n            features = self.post_extract_proj(features)\n\n        features = self.dropout_input(features)\n\n        if mask:\n            x, mask_indices = self.apply_mask(features, padding_mask)\n        else:\n            x = features\n\n        # feature: (B, T, D), float\n        # target: (B, T), long\n        # x: (B, T, D), float\n        # padding_mask: (B, T), bool\n        # mask_indices: (B, T), bool\n        x, layer_results = self.encoder(\n            x, padding_mask=padding_mask, layer=None if output_layer is None else output_layer - 1\n        )\n\n        res = {\"x\": x, \"padding_mask\": padding_mask, \"features\": features, \"layer_results\": layer_results}\n\n        feature = res[\"features\"] if ret_conv else res[\"x\"]\n        if ret_layer_results:\n            feature = (feature, res[\"layer_results\"])\n        return feature, res[\"padding_mask\"]\n\n\nclass ConvFeatureExtractionModel(nn.Module):\n    def __init__(\n        self,\n        conv_layers: List[Tuple[int, int, int]],\n        dropout: float = 0.0,\n        mode: str = \"default\",\n        conv_bias: bool = False,\n        conv_type: str = \"default\",\n    ):\n        super().__init__()\n\n        assert mode in {\"default\", \"layer_norm\"}\n\n        def block(\n            n_in,\n            n_out,\n            k,\n            stride,\n            is_layer_norm=False,\n    
        is_group_norm=False,\n            conv_bias=False,\n        ):\n            def make_conv():\n                conv = nn.Conv1d(n_in, n_out, k, stride=stride, bias=conv_bias)\n                nn.init.kaiming_normal_(conv.weight)\n                return conv\n\n            assert (is_layer_norm and is_group_norm) == False, \"layer norm and group norm are exclusive\"\n\n            if is_layer_norm:\n                return nn.Sequential(\n                    make_conv(),\n                    nn.Dropout(p=dropout),\n                    nn.Sequential(\n                        TransposeLast(),\n                        Fp32LayerNorm(dim, elementwise_affine=True),\n                        TransposeLast(),\n                    ),\n                    nn.GELU(),\n                )\n            elif is_group_norm:\n                return nn.Sequential(\n                    make_conv(),\n                    nn.Dropout(p=dropout),\n                    Fp32GroupNorm(dim, dim, affine=True),\n                    nn.GELU(),\n                )\n            else:\n                return nn.Sequential(make_conv(), nn.Dropout(p=dropout), nn.GELU())\n\n        self.conv_type = conv_type\n        if self.conv_type == \"default\":\n            in_d = 1\n            self.conv_layers = nn.ModuleList()\n            for i, cl in enumerate(conv_layers):\n                assert len(cl) == 3, \"invalid conv definition: \" + str(cl)\n                (dim, k, stride) = cl\n\n                self.conv_layers.append(\n                    block(\n                        in_d,\n                        dim,\n                        k,\n                        stride,\n                        is_layer_norm=mode == \"layer_norm\",\n                        is_group_norm=mode == \"default\" and i == 0,\n                        conv_bias=conv_bias,\n                    )\n                )\n                in_d = dim\n        elif self.conv_type == \"conv2d\":\n            in_d = 1\n            
self.conv_layers = nn.ModuleList()\n            for i, cl in enumerate(conv_layers):\n                assert len(cl) == 3\n                (dim, k, stride) = cl\n\n                self.conv_layers.append(torch.nn.Conv2d(in_d, dim, k, stride))\n                self.conv_layers.append(torch.nn.ReLU())\n                in_d = dim\n        elif self.conv_type == \"custom\":\n            in_d = 1\n            idim = 80\n            self.conv_layers = nn.ModuleList()\n            for i, cl in enumerate(conv_layers):\n                assert len(cl) == 3\n                (dim, k, stride) = cl\n                self.conv_layers.append(torch.nn.Conv2d(in_d, dim, k, stride, padding=1))\n                self.conv_layers.append(torch.nn.LayerNorm([dim, idim]))\n                self.conv_layers.append(torch.nn.ReLU())\n                in_d = dim\n                if (i + 1) % 2 == 0:\n                    self.conv_layers.append(torch.nn.MaxPool2d(2, stride=2, ceil_mode=True))\n                    idim = int(math.ceil(idim / 2))\n        else:\n            pass\n\n    def forward(self, x, mask=None):\n        # BxT -> BxCxT\n        x = x.unsqueeze(1)\n        if self.conv_type == \"custom\":\n            for conv in self.conv_layers:\n                if isinstance(conv, nn.LayerNorm):\n                    x = x.transpose(1, 2)\n                    x = conv(x).transpose(1, 2)\n                else:\n                    x = conv(x)\n            x = x.transpose(2, 3).contiguous()\n            x = x.view(x.size(0), -1, x.size(-1))\n        else:\n            for conv in self.conv_layers:\n                x = conv(x)\n            if self.conv_type == \"conv2d\":\n                b, c, t, f = x.size()\n                x = x.transpose(2, 3).contiguous().view(b, c * f, t)\n        return x\n\n\nclass TransformerEncoder(nn.Module):\n    def __init__(self, args):\n        super().__init__()\n\n        self.dropout = args.dropout\n        self.embedding_dim = args.encoder_embed_dim\n\n       
 self.pos_conv = nn.Conv1d(\n            self.embedding_dim,\n            self.embedding_dim,\n            kernel_size=args.conv_pos,\n            padding=args.conv_pos // 2,\n            groups=args.conv_pos_groups,\n        )\n        dropout = 0\n        std = math.sqrt((4 * (1.0 - dropout)) / (args.conv_pos * self.embedding_dim))\n        nn.init.normal_(self.pos_conv.weight, mean=0, std=std)\n        nn.init.constant_(self.pos_conv.bias, 0)\n\n        self.pos_conv = nn.utils.weight_norm(self.pos_conv, name=\"weight\", dim=2)\n        self.pos_conv = nn.Sequential(self.pos_conv, SamePad(args.conv_pos), nn.GELU())\n\n        if hasattr(args, \"relative_position_embedding\"):\n            self.relative_position_embedding = args.relative_position_embedding\n            self.num_buckets = args.num_buckets\n            self.max_distance = args.max_distance\n        else:\n            self.relative_position_embedding = False\n            self.num_buckets = 0\n            self.max_distance = 0\n\n        self.layers = nn.ModuleList(\n            [\n                TransformerSentenceEncoderLayer(\n                    embedding_dim=self.embedding_dim,\n                    ffn_embedding_dim=args.encoder_ffn_embed_dim,\n                    num_attention_heads=args.encoder_attention_heads,\n                    dropout=self.dropout,\n                    attention_dropout=args.attention_dropout,\n                    activation_dropout=args.activation_dropout,\n                    activation_fn=args.activation_fn,\n                    layer_norm_first=args.layer_norm_first,\n                    has_relative_attention_bias=(self.relative_position_embedding and i == 0),\n                    num_buckets=self.num_buckets,\n                    max_distance=self.max_distance,\n                    gru_rel_pos=args.gru_rel_pos,\n                )\n                for i in range(args.encoder_layers)\n            ]\n        )\n\n        self.layer_norm_first = args.layer_norm_first\n 
       self.layer_norm = LayerNorm(self.embedding_dim)\n        self.layerdrop = args.encoder_layerdrop\n\n        self.apply(init_bert_params)\n\n    def forward(self, x, padding_mask=None, streaming_mask=None, layer=None):\n        x, layer_results = self.extract_features(x, padding_mask, streaming_mask, layer)\n\n        if self.layer_norm_first and layer is None:\n            x = self.layer_norm(x)\n\n        return x, layer_results\n\n    def extract_features(self, x, padding_mask=None, streaming_mask=None, tgt_layer=None):\n        if padding_mask is not None:\n            x[padding_mask] = 0\n\n        x_conv = self.pos_conv(x.transpose(1, 2))\n        x_conv = x_conv.transpose(1, 2)\n        x += x_conv\n\n        if not self.layer_norm_first:\n            x = self.layer_norm(x)\n\n        x = F.dropout(x, p=self.dropout, training=self.training)\n\n        # B x T x C -> T x B x C\n        x = x.transpose(0, 1)\n\n        layer_results = []\n        z = None\n        if tgt_layer is not None:\n            layer_results.append((x, z))\n        r = None\n        pos_bias = None\n        for i, layer in enumerate(self.layers):\n            dropout_probability = np.random.random()\n            if not self.training or (dropout_probability > self.layerdrop):\n                x, z, pos_bias = layer(\n                    x,\n                    self_attn_padding_mask=padding_mask,\n                    need_weights=False,\n                    self_attn_mask=streaming_mask,\n                    pos_bias=pos_bias,\n                )\n            if tgt_layer is not None:\n                layer_results.append((x, z))\n            if i == tgt_layer:\n                r = x\n                break\n\n        if r is not None:\n            x = r\n\n        # T x B x C -> B x T x C\n        x = x.transpose(0, 1)\n\n        return x, layer_results\n\n\nclass TransformerSentenceEncoderLayer(nn.Module):\n    \"\"\"\n    Implements a Transformer Encoder Layer used in BERT/XLM 
style pre-trained\n    models.\n    \"\"\"\n\n    def __init__(\n        self,\n        embedding_dim: int = 768,\n        ffn_embedding_dim: int = 3072,\n        num_attention_heads: int = 8,\n        dropout: float = 0.1,\n        attention_dropout: float = 0.1,\n        activation_dropout: float = 0.1,\n        activation_fn: str = \"relu\",\n        layer_norm_first: bool = False,\n        has_relative_attention_bias: bool = False,\n        num_buckets: int = 0,\n        max_distance: int = 0,\n        rescale_init: bool = False,\n        gru_rel_pos: bool = False,\n    ) -> None:\n        super().__init__()\n        # Initialize parameters\n        self.embedding_dim = embedding_dim\n        self.dropout = dropout\n        self.activation_dropout = activation_dropout\n\n        # Initialize blocks\n        self.activation_name = activation_fn\n        self.activation_fn = get_activation_fn(activation_fn)\n        self.self_attn = MultiheadAttention(\n            self.embedding_dim,\n            num_attention_heads,\n            dropout=attention_dropout,\n            self_attention=True,\n            has_relative_attention_bias=has_relative_attention_bias,\n            num_buckets=num_buckets,\n            max_distance=max_distance,\n            rescale_init=rescale_init,\n            gru_rel_pos=gru_rel_pos,\n        )\n\n        self.dropout1 = nn.Dropout(dropout)\n        self.dropout2 = nn.Dropout(self.activation_dropout)\n        self.dropout3 = nn.Dropout(dropout)\n\n        self.layer_norm_first = layer_norm_first\n\n        # layer norm associated with the self attention layer\n        self.self_attn_layer_norm = LayerNorm(self.embedding_dim)\n\n        if self.activation_name == \"glu\":\n            self.fc1 = GLU_Linear(self.embedding_dim, ffn_embedding_dim, \"swish\")\n        else:\n            self.fc1 = nn.Linear(self.embedding_dim, ffn_embedding_dim)\n        self.fc2 = nn.Linear(ffn_embedding_dim, self.embedding_dim)\n\n        # layer 
norm associated with the position wise feed-forward NN\n        self.final_layer_norm = LayerNorm(self.embedding_dim)\n\n    def forward(\n        self,\n        x: torch.Tensor,\n        self_attn_mask: torch.Tensor = None,\n        self_attn_padding_mask: torch.Tensor = None,\n        need_weights: bool = False,\n        pos_bias=None,\n    ):\n        \"\"\"\n        LayerNorm is applied either before or after the self-attention/ffn\n        modules similar to the original Transformer implementation.\n        \"\"\"\n        residual = x\n\n        if self.layer_norm_first:\n            x = self.self_attn_layer_norm(x)\n            x, attn, pos_bias = self.self_attn(\n                query=x,\n                key=x,\n                value=x,\n                key_padding_mask=self_attn_padding_mask,\n                need_weights=False,\n                attn_mask=self_attn_mask,\n                position_bias=pos_bias,\n            )\n            x = self.dropout1(x)\n            x = residual + x\n\n            residual = x\n            x = self.final_layer_norm(x)\n            if self.activation_name == \"glu\":\n                x = self.fc1(x)\n            else:\n                x = self.activation_fn(self.fc1(x))\n            x = self.dropout2(x)\n            x = self.fc2(x)\n            x = self.dropout3(x)\n            x = residual + x\n        else:\n            x, attn, pos_bias = self.self_attn(\n                query=x,\n                key=x,\n                value=x,\n                key_padding_mask=self_attn_padding_mask,\n                need_weights=need_weights,\n                attn_mask=self_attn_mask,\n                position_bias=pos_bias,\n            )\n\n            x = self.dropout1(x)\n            x = residual + x\n\n            x = self.self_attn_layer_norm(x)\n\n            residual = x\n            if self.activation_name == \"glu\":\n                x = self.fc1(x)\n            else:\n                x = 
self.activation_fn(self.fc1(x))\n            x = self.dropout2(x)\n            x = self.fc2(x)\n            x = self.dropout3(x)\n            x = residual + x\n            x = self.final_layer_norm(x)\n\n        return x, attn, pos_bias\n"
  },
  {
    "path": "TTS/vocoder/README.md",
    "content": "# Mozilla TTS Vocoders (Experimental)\n\nHere there are vocoder model implementations which can be combined with the other TTS models.\n\nCurrently, following models are implemented:\n\n- Melgan\n- MultiBand-Melgan\n- ParallelWaveGAN\n- GAN-TTS (Discriminator Only)\n\nIt is also very easy to adapt different vocoder models as we provide a flexible and modular (but not too modular) framework.\n\n## Training a model\n\nYou can see here an example (Soon)[Colab Notebook]() training MelGAN with LJSpeech dataset.\n\nIn order to train a new model, you need to gather all wav files into a folder and give this folder to `data_path` in '''config.json'''\n\nYou need to define other relevant parameters in your ```config.json``` and then start traning with the following command.\n\n```CUDA_VISIBLE_DEVICES='0' python tts/bin/train_vocoder.py --config_path path/to/config.json```\n\nExample config files can be found under `tts/vocoder/configs/` folder.\n\nYou can continue a previous training run by the following command.\n\n```CUDA_VISIBLE_DEVICES='0' python tts/bin/train_vocoder.py --continue_path path/to/your/model/folder```\n\nYou can fine-tune a pre-trained model by the following command.\n\n```CUDA_VISIBLE_DEVICES='0' python tts/bin/train_vocoder.py --restore_path path/to/your/model.pth```\n\nRestoring a model starts a new training in a different folder. It only restores model weights with the given checkpoint file. However, continuing a training starts from the same directory where the previous training run left off.\n\nYou can also follow your training runs on Tensorboard as you do with our TTS models.\n\n## Acknowledgement\nThanks to @kan-bayashi for his [repository](https://github.com/kan-bayashi/ParallelWaveGAN) being the start point of our work.\n"
  },
  {
    "path": "TTS/vocoder/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/vocoder/configs/__init__.py",
    "content": "import importlib\nimport os\nfrom inspect import isclass\n\n# import all files under configs/\nconfigs_dir = os.path.dirname(__file__)\nfor file in os.listdir(configs_dir):\n    path = os.path.join(configs_dir, file)\n    if not file.startswith(\"_\") and not file.startswith(\".\") and (file.endswith(\".py\") or os.path.isdir(path)):\n        config_name = file[: file.find(\".py\")] if file.endswith(\".py\") else file\n        module = importlib.import_module(\"TTS.vocoder.configs.\" + config_name)\n        for attribute_name in dir(module):\n            attribute = getattr(module, attribute_name)\n\n            if isclass(attribute):\n                # Add the class to this package's variables\n                globals()[attribute_name] = attribute\n"
  },
  {
    "path": "TTS/vocoder/configs/fullband_melgan_config.py",
    "content": "from dataclasses import dataclass, field\n\nfrom .shared_configs import BaseGANVocoderConfig\n\n\n@dataclass\nclass FullbandMelganConfig(BaseGANVocoderConfig):\n    \"\"\"Defines parameters for FullBand MelGAN vocoder.\n\n    Example:\n\n        >>> from TTS.vocoder.configs import FullbandMelganConfig\n        >>> config = FullbandMelganConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `fullband_melgan`.\n        discriminator_model (str): One of the discriminators from `TTS.vocoder.models.*_discriminator`. Defaults to\n            'melgan_multiscale_discriminator`.\n        discriminator_model_params (dict): The discriminator model parameters. Defaults to\n            '{\"base_channels\": 16, \"max_channels\": 1024, \"downsample_factors\": [4, 4, 4, 4]}`\n        generator_model (str): One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `melgan_generator`.\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 16.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 8192.\n        pad_short (int):\n            Additional padding applied to the audio samples shorter than `seq_len`. Defaults to 0.\n        use_noise_augment (bool):\n            enable / disable random noise added to the input waveform. The noise is added after computing the\n            features. Defaults to True.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        use_stft_loss (bool):\n            enable / disable use of STFT loss originally used by ParallelWaveGAN model. 
Defaults to True.\n        use_subband_stft_loss (bool):\n            enable / disable use of subband loss computation originally used by MultiBandMelgan model. Defaults to False.\n        use_mse_gan_loss (bool):\n            enable / disable using Mean Square Error GAN loss. Defaults to True.\n        use_hinge_gan_loss (bool):\n            enable / disable using Hinge GAN loss. You should choose either Hinge or MSE loss for training GAN models.\n            Defaults to False.\n        use_feat_match_loss (bool):\n            enable / disable using Feature Matching loss originally used by MelGAN model. Defaults to True.\n        use_l1_spec_loss (bool):\n            enable / disable using L1 spectrogram loss originally used by HifiGAN model. Defaults to False.\n        stft_loss_params (dict): STFT loss parameters. Defaults to\n        `{\"n_ffts\": [1024, 2048, 512], \"hop_lengths\": [120, 240, 50], \"win_lengths\": [600, 1200, 240]}`\n        stft_loss_weight (float): STFT loss weight that multiplies the computed loss before summing up the total\n            model loss. Defaults to 0.5.\n        subband_stft_loss_weight (float):\n            Subband STFT loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        mse_G_loss_weight (float):\n            MSE generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 2.5.\n        hinge_G_loss_weight (float):\n            Hinge generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        feat_match_loss_weight (float):\n            Feature matching loss weight that multiplies the computed loss before summing up the total loss. Defaults to 108.\n        l1_spec_loss_weight (float):\n            L1 spectrogram loss weight that multiplies the computed loss before summing up the total loss. 
Defaults to 0.\n    \"\"\"\n\n    model: str = \"fullband_melgan\"\n\n    # Model specific params\n    discriminator_model: str = \"melgan_multiscale_discriminator\"\n    discriminator_model_params: dict = field(\n        default_factory=lambda: {\"base_channels\": 16, \"max_channels\": 512, \"downsample_factors\": [4, 4, 4]}\n    )\n    generator_model: str = \"melgan_generator\"\n    generator_model_params: dict = field(\n        default_factory=lambda: {\"upsample_factors\": [8, 8, 2, 2], \"num_res_blocks\": 4}\n    )\n\n    # Training - overrides\n    batch_size: int = 16\n    seq_len: int = 8192\n    pad_short: int = 2000\n    use_noise_augment: bool = True\n    use_cache: bool = True\n\n    # LOSS PARAMETERS - overrides\n    use_stft_loss: bool = True\n    use_subband_stft_loss: bool = False\n    use_mse_gan_loss: bool = True\n    use_hinge_gan_loss: bool = False\n    use_feat_match_loss: bool = True  # requires MelGAN Discriminators (MelGAN and HifiGAN)\n    use_l1_spec_loss: bool = False\n\n    stft_loss_params: dict = field(\n        default_factory=lambda: {\n            \"n_ffts\": [1024, 2048, 512],\n            \"hop_lengths\": [120, 240, 50],\n            \"win_lengths\": [600, 1200, 240],\n        }\n    )\n\n    # loss weights - overrides\n    stft_loss_weight: float = 0.5\n    subband_stft_loss_weight: float = 0\n    mse_G_loss_weight: float = 2.5\n    hinge_G_loss_weight: float = 0\n    feat_match_loss_weight: float = 108\n    l1_spec_loss_weight: float = 0.0\n"
  },
  {
    "path": "TTS/vocoder/configs/hifigan_config.py",
    "content": "from dataclasses import dataclass, field\n\nfrom TTS.vocoder.configs.shared_configs import BaseGANVocoderConfig\n\n\n@dataclass\nclass HifiganConfig(BaseGANVocoderConfig):\n    \"\"\"Defines parameters for FullBand MelGAN vocoder.\n\n    Example:\n\n        >>> from TTS.vocoder.configs import HifiganConfig\n        >>> config = HifiganConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `hifigan`.\n        discriminator_model (str): One of the discriminators from `TTS.vocoder.models.*_discriminator`. Defaults to\n            'hifigan_discriminator`.\n        generator_model (str): One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `hifigan_generator`.\n        generator_model_params (dict): Parameters of the generator model. Defaults to\n            `\n            {\n                \"upsample_factors\": [8, 8, 2, 2],\n                \"upsample_kernel_sizes\": [16, 16, 4, 4],\n                \"upsample_initial_channel\": 512,\n                \"resblock_kernel_sizes\": [3, 7, 11],\n                \"resblock_dilation_sizes\": [[1, 3, 5], [1, 3, 5], [1, 3, 5]],\n                \"resblock_type\": \"1\",\n            }\n            `\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 16.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 8192.\n        pad_short (int):\n            Additional padding applied to the audio samples shorter than `seq_len`. Defaults to 0.\n        use_noise_augment (bool):\n            enable / disable random noise added to the input waveform. The noise is added after computing the\n            features. Defaults to True.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. 
It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        use_stft_loss (bool):\n            enable / disable use of STFT loss originally used by ParallelWaveGAN model. Defaults to False.\n        use_subband_stft_loss (bool):\n            enable / disable use of subband loss computation originally used by MultiBandMelgan model. Defaults to False.\n        use_mse_gan_loss (bool):\n            enable / disable using Mean Square Error GAN loss. Defaults to True.\n        use_hinge_gan_loss (bool):\n            enable / disable using Hinge GAN loss. You should choose either Hinge or MSE loss for training GAN models.\n            Defaults to False.\n        use_feat_match_loss (bool):\n            enable / disable using Feature Matching loss originally used by MelGAN model. Defaults to True.\n        use_l1_spec_loss (bool):\n            enable / disable using L1 spectrogram loss originally used by HifiGAN model. Defaults to True.\n        stft_loss_params (dict):\n            STFT loss parameters. Defaults to\n            `{\n                \"n_ffts\": [1024, 2048, 512],\n                \"hop_lengths\": [120, 240, 50],\n                \"win_lengths\": [600, 1200, 240]\n            }`\n        l1_spec_loss_params (dict):\n            L1 spectrogram loss parameters. Defaults to\n            `{\n                \"use_mel\": True,\n                \"sample_rate\": 22050,\n                \"n_fft\": 1024,\n                \"hop_length\": 256,\n                \"win_length\": 1024,\n                \"n_mels\": 80,\n                \"mel_fmin\": 0.0,\n                \"mel_fmax\": None,\n            }`\n        stft_loss_weight (float): STFT loss weight that multiplies the computed loss before summing up the total\n            model loss. Defaults to 0.\n        subband_stft_loss_weight (float):\n            Subband STFT loss weight that multiplies the computed loss before summing up the total loss. 
Defaults to 0.\n        mse_G_loss_weight (float):\n            MSE generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 1.\n        hinge_G_loss_weight (float):\n            Hinge generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        feat_match_loss_weight (float):\n            Feature matching loss weight that multiplies the computed loss before summing up the total loss. Defaults to 108.\n        l1_spec_loss_weight (float):\n            L1 spectrogram loss weight that multiplies the computed loss before summing up the total loss. Defaults to 45.\n    \"\"\"\n\n    model: str = \"hifigan\"\n    # model specific params\n    discriminator_model: str = \"hifigan_discriminator\"\n    generator_model: str = \"hifigan_generator\"\n    generator_model_params: dict = field(\n        default_factory=lambda: {\n            \"upsample_factors\": [8, 8, 2, 2],\n            \"upsample_kernel_sizes\": [16, 16, 4, 4],\n            \"upsample_initial_channel\": 512,\n            \"resblock_kernel_sizes\": [3, 7, 11],\n            \"resblock_dilation_sizes\": [[1, 3, 5], [1, 3, 5], [1, 3, 5]],\n            \"resblock_type\": \"1\",\n        }\n    )\n\n    # LOSS PARAMETERS - overrides\n    use_stft_loss: bool = False\n    use_subband_stft_loss: bool = False\n    use_mse_gan_loss: bool = True\n    use_hinge_gan_loss: bool = False\n    use_feat_match_loss: bool = True  # requires MelGAN Discriminators (MelGAN and HifiGAN)\n    use_l1_spec_loss: bool = True\n\n    # loss weights - overrides\n    stft_loss_weight: float = 0\n    subband_stft_loss_weight: float = 0\n    mse_G_loss_weight: float = 1\n    hinge_G_loss_weight: float = 0\n    feat_match_loss_weight: float = 108\n    l1_spec_loss_weight: float = 45\n    l1_spec_loss_params: dict = field(\n        default_factory=lambda: {\n            \"use_mel\": True,\n            \"sample_rate\": 22050,\n            \"n_fft\": 
1024,\n            \"hop_length\": 256,\n            \"win_length\": 1024,\n            \"n_mels\": 80,\n            \"mel_fmin\": 0.0,\n            \"mel_fmax\": None,\n        }\n    )\n\n    # optimizer parameters\n    lr: float = 1e-4\n    wd: float = 1e-6\n"
  },
  {
    "path": "TTS/vocoder/configs/melgan_config.py",
    "content": "from dataclasses import dataclass, field\n\nfrom TTS.vocoder.configs.shared_configs import BaseGANVocoderConfig\n\n\n@dataclass\nclass MelganConfig(BaseGANVocoderConfig):\n    \"\"\"Defines parameters for MelGAN vocoder.\n\n    Example:\n\n        >>> from TTS.vocoder.configs import MelganConfig\n        >>> config = MelganConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `melgan`.\n        discriminator_model (str): One of the discriminators from `TTS.vocoder.models.*_discriminator`. Defaults to\n            'melgan_multiscale_discriminator`.\n        discriminator_model_params (dict): The discriminator model parameters. Defaults to\n            '{\"base_channels\": 16, \"max_channels\": 1024, \"downsample_factors\": [4, 4, 4, 4]}`\n        generator_model (str): One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `melgan_generator`.\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 16.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 8192.\n        pad_short (int):\n            Additional padding applied to the audio samples shorter than `seq_len`. Defaults to 0.\n        use_noise_augment (bool):\n            enable / disable random noise added to the input waveform. The noise is added after computing the\n            features. Defaults to True.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        use_stft_loss (bool):\n            enable / disable use of STFT loss originally used by ParallelWaveGAN model. 
Defaults to True.\n        use_subband_stft_loss (bool):\n            enable / disable use of subband loss computation originally used by MultiBandMelgan model. Defaults to False.\n        use_mse_gan_loss (bool):\n            enable / disable using Mean Square Error GAN loss. Defaults to True.\n        use_hinge_gan_loss (bool):\n            enable / disable using Hinge GAN loss. You should choose either Hinge or MSE loss for training GAN models.\n            Defaults to False.\n        use_feat_match_loss (bool):\n            enable / disable using Feature Matching loss originally used by MelGAN model. Defaults to True.\n        use_l1_spec_loss (bool):\n            enable / disable using L1 spectrogram loss originally used by HifiGAN model. Defaults to False.\n        stft_loss_params (dict): STFT loss parameters. Defaults to\n        `{\"n_ffts\": [1024, 2048, 512], \"hop_lengths\": [120, 240, 50], \"win_lengths\": [600, 1200, 240]}`\n        stft_loss_weight (float): STFT loss weight that multiplies the computed loss before summing up the total\n            model loss. Defaults to 0.5.\n        subband_stft_loss_weight (float):\n            Subband STFT loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        mse_G_loss_weight (float):\n            MSE generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 2.5.\n        hinge_G_loss_weight (float):\n            Hinge generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        feat_match_loss_weight (float):\n            Feature matching loss weight that multiplies the computed loss before summing up the total loss. Defaults to 108.\n        l1_spec_loss_weight (float):\n            L1 spectrogram loss weight that multiplies the computed loss before summing up the total loss. 
Defaults to 0.\n    \"\"\"\n\n    model: str = \"melgan\"\n\n    # Model specific params\n    discriminator_model: str = \"melgan_multiscale_discriminator\"\n    discriminator_model_params: dict = field(\n        default_factory=lambda: {\"base_channels\": 16, \"max_channels\": 1024, \"downsample_factors\": [4, 4, 4, 4]}\n    )\n    generator_model: str = \"melgan_generator\"\n    generator_model_params: dict = field(\n        default_factory=lambda: {\"upsample_factors\": [8, 8, 2, 2], \"num_res_blocks\": 3}\n    )\n\n    # Training - overrides\n    batch_size: int = 16\n    seq_len: int = 8192\n    pad_short: int = 2000\n    use_noise_augment: bool = True\n    use_cache: bool = True\n\n    # LOSS PARAMETERS - overrides\n    use_stft_loss: bool = True\n    use_subband_stft_loss: bool = False\n    use_mse_gan_loss: bool = True\n    use_hinge_gan_loss: bool = False\n    use_feat_match_loss: bool = True  # requires MelGAN Discriminators (MelGAN and HifiGAN)\n    use_l1_spec_loss: bool = False\n\n    stft_loss_params: dict = field(\n        default_factory=lambda: {\n            \"n_ffts\": [1024, 2048, 512],\n            \"hop_lengths\": [120, 240, 50],\n            \"win_lengths\": [600, 1200, 240],\n        }\n    )\n\n    # loss weights - overrides\n    stft_loss_weight: float = 0.5\n    subband_stft_loss_weight: float = 0\n    mse_G_loss_weight: float = 2.5\n    hinge_G_loss_weight: float = 0\n    feat_match_loss_weight: float = 108\n    l1_spec_loss_weight: float = 0\n"
  },
  {
    "path": "TTS/vocoder/configs/multiband_melgan_config.py",
    "content": "from dataclasses import dataclass, field\n\nfrom TTS.vocoder.configs.shared_configs import BaseGANVocoderConfig\n\n\n@dataclass\nclass MultibandMelganConfig(BaseGANVocoderConfig):\n    \"\"\"Defines parameters for MultiBandMelGAN vocoder.\n\n    Example:\n\n        >>> from TTS.vocoder.configs import MultibandMelganConfig\n        >>> config = MultibandMelganConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `multiband_melgan`.\n        discriminator_model (str): One of the discriminators from `TTS.vocoder.models.*_discriminator`. Defaults to\n            'melgan_multiscale_discriminator`.\n        discriminator_model_params (dict): The discriminator model parameters. Defaults to\n            '{\n                \"base_channels\": 16,\n                \"max_channels\": 512,\n                \"downsample_factors\": [4, 4, 4]\n            }`\n        generator_model (str): One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `melgan_generator`.\n        generator_model_param (dict):\n            The generator model parameters. Defaults to `{\"upsample_factors\": [8, 4, 2], \"num_res_blocks\": 4}`.\n        use_pqmf (bool):\n            enable / disable PQMF modulation for multi-band training. Defaults to True.\n        lr_gen (float):\n            Initial learning rate for the generator model. Defaults to 0.0001.\n        lr_disc (float):\n            Initial learning rate for the discriminator model. Defaults to 0.0001.\n        optimizer (torch.optim.Optimizer):\n            Optimizer used for the training. Defaults to `AdamW`.\n        optimizer_params (dict):\n            Optimizer kwargs. Defaults to `{\"betas\": [0.8, 0.99], \"weight_decay\": 0.0}`\n        lr_scheduler_gen (torch.optim.Scheduler):\n            Learning rate scheduler for the generator. 
Defaults to `MultiStepLR`.\n        lr_scheduler_gen_params (dict):\n            Parameters for the generator learning rate scheduler. Defaults to\n            `{\"gamma\": 0.5, \"milestones\": [100000, 200000, 300000, 400000, 500000, 600000]}`.\n        lr_scheduler_disc (torch.optim.Scheduler):\n            Learning rate scheduler for the discriminator. Defaults to `MultiStepLR`.\n        lr_scheduler_disc_params (dict):\n            Parameters for the discriminator learning rate scheduler. Defaults to\n            `{\"gamma\": 0.5, \"milestones\": [100000, 200000, 300000, 400000, 500000, 600000]}`.\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 64.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 16384.\n        pad_short (int):\n            Additional padding applied to the audio samples shorter than `seq_len`. Defaults to 2000.\n        use_noise_augment (bool):\n            enable / disable random noise added to the input waveform. The noise is added after computing the\n            features. Defaults to False.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        steps_to_start_discriminator (int):\n            Number of steps required to start training the discriminator. Defaults to 200000.\n        use_stft_loss (bool):\n            enable / disable use of STFT loss originally used by ParallelWaveGAN model. Defaults to True.\n        use_subband_stft_loss (bool):\n            enable / disable use of subband loss computation originally used by MultiBandMelgan model. Defaults to True.\n        use_mse_gan_loss (bool):\n            enable / disable using Mean Square Error GAN loss. Defaults to True.\n        use_hinge_gan_loss (bool):\n            enable / disable using Hinge GAN loss. 
You should choose either Hinge or MSE loss for training GAN models.\n            Defaults to False.\n        use_feat_match_loss (bool):\n            enable / disable using Feature Matching loss originally used by MelGAN model. Defaults to False.\n        use_l1_spec_loss (bool):\n            enable / disable using L1 spectrogram loss originally used by HifiGAN model. Defaults to False.\n        stft_loss_params (dict): STFT loss parameters. Defaults to\n            `{\"n_ffts\": [1024, 2048, 512], \"hop_lengths\": [120, 240, 50], \"win_lengths\": [600, 1200, 240]}`\n        stft_loss_weight (float): STFT loss weight that multiplies the computed loss before summing up the total\n            model loss. Defaults to 0.5.\n        subband_stft_loss_weight (float):\n            Subband STFT loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        mse_G_loss_weight (float):\n            MSE generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 2.5.\n        hinge_G_loss_weight (float):\n            Hinge generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        feat_match_loss_weight (float):\n            Feature matching loss weight that multiplies the computed loss before summing up the total loss. Defaults to 108.\n        l1_spec_loss_weight (float):\n            L1 spectrogram loss weight that multiplies the computed loss before summing up the total loss. 
Defaults to 0.\n    \"\"\"\n\n    model: str = \"multiband_melgan\"\n\n    # Model specific params\n    discriminator_model: str = \"melgan_multiscale_discriminator\"\n    discriminator_model_params: dict = field(\n        default_factory=lambda: {\"base_channels\": 16, \"max_channels\": 512, \"downsample_factors\": [4, 4, 4]}\n    )\n    generator_model: str = \"multiband_melgan_generator\"\n    generator_model_params: dict = field(default_factory=lambda: {\"upsample_factors\": [8, 4, 2], \"num_res_blocks\": 4})\n    use_pqmf: bool = True\n\n    # optimizer - overrides\n    lr_gen: float = 0.0001  # Initial learning rate.\n    lr_disc: float = 0.0001  # Initial learning rate.\n    optimizer: str = \"AdamW\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.8, 0.99], \"weight_decay\": 0.0})\n    lr_scheduler_gen: str = \"MultiStepLR\"  # one of the schedulers from https://pytorch.org/docs/stable/optim.html\n    lr_scheduler_gen_params: dict = field(\n        default_factory=lambda: {\"gamma\": 0.5, \"milestones\": [100000, 200000, 300000, 400000, 500000, 600000]}\n    )\n    lr_scheduler_disc: str = \"MultiStepLR\"  # one of the schedulers from https://pytorch.org/docs/stable/optim.html\n    lr_scheduler_disc_params: dict = field(\n        default_factory=lambda: {\"gamma\": 0.5, \"milestones\": [100000, 200000, 300000, 400000, 500000, 600000]}\n    )\n\n    # Training - overrides\n    batch_size: int = 64\n    seq_len: int = 16384\n    pad_short: int = 2000\n    use_noise_augment: bool = False\n    use_cache: bool = True\n    steps_to_start_discriminator: int = 200000\n\n    # LOSS PARAMETERS - overrides\n    use_stft_loss: bool = True\n    use_subband_stft_loss: bool = True\n    use_mse_gan_loss: bool = True\n    use_hinge_gan_loss: bool = False\n    use_feat_match_loss: bool = False  # requires MelGAN Discriminators (MelGAN and HifiGAN)\n    use_l1_spec_loss: bool = False\n\n    subband_stft_loss_params: dict = field(\n        
default_factory=lambda: {\"n_ffts\": [384, 683, 171], \"hop_lengths\": [30, 60, 10], \"win_lengths\": [150, 300, 60]}\n    )\n\n    # loss weights - overrides\n    stft_loss_weight: float = 0.5\n    subband_stft_loss_weight: float = 0\n    mse_G_loss_weight: float = 2.5\n    hinge_G_loss_weight: float = 0\n    feat_match_loss_weight: float = 108\n    l1_spec_loss_weight: float = 0\n"
  },
  {
    "path": "TTS/vocoder/configs/parallel_wavegan_config.py",
    "content": "from dataclasses import dataclass, field\n\nfrom .shared_configs import BaseGANVocoderConfig\n\n\n@dataclass\nclass ParallelWaveganConfig(BaseGANVocoderConfig):\n    \"\"\"Defines parameters for ParallelWavegan vocoder.\n\n    Args:\n        model (str):\n            Model name used for selecting the right configuration at initialization. Defaults to `gan`.\n        discriminator_model (str): One of the discriminators from `TTS.vocoder.models.*_discriminator`. Defaults to\n            'parallel_wavegan_discriminator`.\n        discriminator_model_params (dict): The discriminator model kwargs. Defaults to\n            '{\"num_layers\": 10}`\n        generator_model (str): One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `parallel_wavegan_generator`.\n        generator_model_param (dict):\n            The generator model kwargs. Defaults to `{\"upsample_factors\": [4, 4, 4, 4], \"stacks\": 3, \"num_res_blocks\": 30}`.\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 16.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 8192.\n        pad_short (int):\n            Additional padding applied to the audio samples shorter than `seq_len`. Defaults to 0.\n        use_noise_augment (bool):\n            enable / disable random noise added to the input waveform. The noise is added after computing the\n            features. Defaults to True.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        steps_to_start_discriminator (int):\n            Number of steps required to start training the discriminator. 
Defaults to 200000.\n        use_stft_loss (bool):\n            enable / disable use of STFT loss originally used by ParallelWaveGAN model. Defaults to True.\n        use_subband_stft_loss (bool):\n            enable / disable use of subband loss computation originally used by MultiBandMelgan model. Defaults to False.\n        use_mse_gan_loss (bool):\n            enable / disable using Mean Square Error GAN loss. Defaults to True.\n        use_hinge_gan_loss (bool):\n            enable / disable using Hinge GAN loss. You should choose either Hinge or MSE loss for training GAN models.\n            Defaults to False.\n        use_feat_match_loss (bool):\n            enable / disable using Feature Matching loss originally used by MelGAN model. Defaults to False.\n        use_l1_spec_loss (bool):\n            enable / disable using L1 spectrogram loss originally used by HifiGAN model. Defaults to False.\n        stft_loss_params (dict): STFT loss parameters. Defaults to\n            `{\"n_ffts\": [1024, 2048, 512], \"hop_lengths\": [120, 240, 50], \"win_lengths\": [600, 1200, 240]}`\n        stft_loss_weight (float): STFT loss weight that multiplies the computed loss before summing up the total\n            model loss. Defaults to 0.5.\n        subband_stft_loss_weight (float):\n            Subband STFT loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        mse_G_loss_weight (float):\n            MSE generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 2.5.\n        hinge_G_loss_weight (float):\n            Hinge generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        feat_match_loss_weight (float):\n            Feature matching loss weight that multiplies the computed loss before summing up the total loss. 
Defaults to 0.\n        l1_spec_loss_weight (float):\n            L1 spectrogram loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        lr_gen (float):\n            Generator model initial learning rate. Defaults to 0.0002.\n        lr_disc (float):\n            Discriminator model initial learning rate. Defaults to 0.0002.\n        optimizer (torch.optim.Optimizer):\n            Optimizer used for the training. Defaults to `AdamW`.\n        optimizer_params (dict):\n            Optimizer kwargs. Defaults to `{\"betas\": [0.8, 0.99], \"weight_decay\": 0.0}`\n        lr_scheduler_gen (torch.optim.Scheduler):\n            Learning rate scheduler for the generator. Defaults to `StepLR`.\n        lr_scheduler_gen_params (dict):\n            Parameters for the generator learning rate scheduler. Defaults to `{\"gamma\": 0.5, \"step_size\": 200000, \"last_epoch\": -1}`.\n        lr_scheduler_disc (torch.optim.Scheduler):\n            Learning rate scheduler for the discriminator. Defaults to `StepLR`.\n        lr_scheduler_disc_params (dict):\n            Parameters for the discriminator learning rate scheduler. 
Defaults to `{\"gamma\": 0.5, \"step_size\": 200000, \"last_epoch\": -1}`.\n    \"\"\"\n\n    model: str = \"parallel_wavegan\"\n\n    # Model specific params\n    discriminator_model: str = \"parallel_wavegan_discriminator\"\n    discriminator_model_params: dict = field(default_factory=lambda: {\"num_layers\": 10})\n    generator_model: str = \"parallel_wavegan_generator\"\n    generator_model_params: dict = field(\n        default_factory=lambda: {\"upsample_factors\": [4, 4, 4, 4], \"stacks\": 3, \"num_res_blocks\": 30}\n    )\n\n    # Training - overrides\n    batch_size: int = 6\n    seq_len: int = 25600\n    pad_short: int = 2000\n    use_noise_augment: bool = False\n    use_cache: bool = True\n    steps_to_start_discriminator: int = 200000\n\n    # LOSS PARAMETERS - overrides\n    use_stft_loss: bool = True\n    use_subband_stft_loss: bool = False\n    use_mse_gan_loss: bool = True\n    use_hinge_gan_loss: bool = False\n    use_feat_match_loss: bool = False  # requires MelGAN Discriminators (MelGAN and HifiGAN)\n    use_l1_spec_loss: bool = False\n\n    stft_loss_params: dict = field(\n        default_factory=lambda: {\n            \"n_ffts\": [1024, 2048, 512],\n            \"hop_lengths\": [120, 240, 50],\n            \"win_lengths\": [600, 1200, 240],\n        }\n    )\n\n    # loss weights - overrides\n    stft_loss_weight: float = 0.5\n    subband_stft_loss_weight: float = 0\n    mse_G_loss_weight: float = 2.5\n    hinge_G_loss_weight: float = 0\n    feat_match_loss_weight: float = 0\n    l1_spec_loss_weight: float = 0\n\n    # optimizer overrides\n    lr_gen: float = 0.0002  # Initial learning rate.\n    lr_disc: float = 0.0002  # Initial learning rate.\n    optimizer: str = \"AdamW\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.8, 0.99], \"weight_decay\": 0.0})\n    lr_scheduler_gen: str = \"StepLR\"  # one of the schedulers from https://pytorch.org/docs/stable/optim.html\n    lr_scheduler_gen_params: dict = 
field(default_factory=lambda: {\"gamma\": 0.5, \"step_size\": 200000, \"last_epoch\": -1})\n    lr_scheduler_disc: str = \"StepLR\"  # one of the schedulers from https://pytorch.org/docs/stable/optim.html\n    lr_scheduler_disc_params: dict = field(\n        default_factory=lambda: {\"gamma\": 0.5, \"step_size\": 200000, \"last_epoch\": -1}\n    )\n    scheduler_after_epoch: bool = False\n"
  },
  {
    "path": "TTS/vocoder/configs/shared_configs.py",
    "content": "from dataclasses import dataclass, field\n\nfrom TTS.config import BaseAudioConfig, BaseTrainingConfig\n\n\n@dataclass\nclass BaseVocoderConfig(BaseTrainingConfig):\n    \"\"\"Shared parameters among all the vocoder models.\n    Args:\n        audio (BaseAudioConfig):\n            Audio processor config instance. Defaultsto `BaseAudioConfig()`.\n        use_noise_augment (bool):\n            Augment the input audio with random noise. Defaults to False/\n        eval_split_size (int):\n            Number of instances used for evaluation. Defaults to 10.\n        data_path (str):\n            Root path of the training data. All the audio files found recursively from this root path are used for\n            training. Defaults to `\"\"`.\n        feature_path (str):\n            Root path to the precomputed feature files. Defaults to None.\n        seq_len (int):\n            Length of the waveform segments used for training. Defaults to 1000.\n        pad_short (int):\n            Extra padding for the waveforms shorter than `seq_len`. Defaults to 0.\n        conv_path (int):\n            Extra padding for the feature frames against convolution of the edge frames. Defaults to MISSING.\n            Defaults to 0.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. If the RAM is not enough, if may cause OOM.\n            Defaults to False.\n        epochs (int):\n            Number of training epochs to. Defaults to 10000.\n        wd (float):\n            Weight decay.\n         optimizer (torch.optim.Optimizer):\n            Optimizer used for the training. Defaults to `AdamW`.\n        optimizer_params (dict):\n            Optimizer kwargs. 
Defaults to `{\"betas\": [0.8, 0.99], \"weight_decay\": 0.0}`\n    \"\"\"\n\n    audio: BaseAudioConfig = field(default_factory=BaseAudioConfig)\n    # dataloading\n    use_noise_augment: bool = False  # enable/disable random noise augmentation in spectrograms.\n    eval_split_size: int = 10  # number of samples used for evaluation.\n    # dataset\n    data_path: str = \"\"  # root data path. It finds all wav files recursively from there.\n    feature_path: str = None  # if you use precomputed features\n    seq_len: int = 1000  # signal length used in training.\n    pad_short: int = 0  # additional padding for short wavs\n    conv_pad: int = 0  # additional padding against convolutions applied to spectrograms\n    use_cache: bool = False  # use in memory cache to keep the computed features. This might cause OOM.\n    # OPTIMIZER\n    epochs: int = 10000  # total number of epochs to train.\n    wd: float = 0.0  # Weight decay weight.\n    optimizer: str = \"AdamW\"\n    optimizer_params: dict = field(default_factory=lambda: {\"betas\": [0.8, 0.99], \"weight_decay\": 0.0})\n\n\n@dataclass\nclass BaseGANVocoderConfig(BaseVocoderConfig):\n    \"\"\"Base config class used among all the GAN based vocoders.\n    Args:\n        use_stft_loss (bool):\n            enable / disable the use of STFT loss. Defaults to True.\n        use_subband_stft_loss (bool):\n            enable / disable the use of Subband STFT loss. Defaults to True.\n        use_mse_gan_loss (bool):\n            enable / disable the use of Mean Squared Error based GAN loss. Defaults to True.\n        use_hinge_gan_loss (bool):\n            enable / disable the use of Hinge GAN loss. Defaults to True.\n        use_feat_match_loss (bool):\n            enable / disable feature matching loss. Defaults to True.\n        use_l1_spec_loss (bool):\n            enable / disable L1 spectrogram loss. Defaults to True.\n        stft_loss_weight (float):\n            Loss weight that multiplies the computed loss value. 
Defaults to 0.\n        subband_stft_loss_weight (float):\n            Loss weight that multiplies the computed loss value. Defaults to 0.\n        mse_G_loss_weight (float):\n            Loss weight that multiplies the computed loss value. Defaults to 1.\n        hinge_G_loss_weight (float):\n            Loss weight that multiplies the computed loss value. Defaults to 0.\n        feat_match_loss_weight (float):\n            Loss weight that multiplies the computed loss value. Defaults to 100.\n        l1_spec_loss_weight (float):\n            Loss weight that multiplies the computed loss value. Defaults to 45.\n        stft_loss_params (dict):\n            Parameters for the STFT loss. Defaults to `{\"n_ffts\": [1024, 2048, 512], \"hop_lengths\": [120, 240, 50], \"win_lengths\": [600, 1200, 240]}`.\n        l1_spec_loss_params (dict):\n            Parameters for the L1 spectrogram loss. Defaults to\n            `{\n                \"use_mel\": True,\n                \"sample_rate\": 22050,\n                \"n_fft\": 1024,\n                \"hop_length\": 256,\n                \"win_length\": 1024,\n                \"n_mels\": 80,\n                \"mel_fmin\": 0.0,\n                \"mel_fmax\": None,\n            }`\n        target_loss (str):\n            Target loss name that defines the quality of the model. Defaults to `loss_0`.\n        grad_clip (list):\n            A list of gradient clipping thresholds for each optimizer. Any value less than 0 disables clipping.\n            Defaults to [5, 5].\n        lr_gen (float):\n            Generator model initial learning rate. Defaults to 0.0002.\n        lr_disc (float):\n            Discriminator model initial learning rate. Defaults to 0.0002.\n        lr_scheduler_gen (torch.optim.Scheduler):\n            Learning rate scheduler for the generator. Defaults to `ExponentialLR`.\n        lr_scheduler_gen_params (dict):\n            Parameters for the generator learning rate scheduler. 
Defaults to `{\"gamma\": 0.999, \"last_epoch\": -1}`.\n        lr_scheduler_disc (torch.optim.Scheduler):\n            Learning rate scheduler for the discriminator. Defaults to `ExponentialLR`.\n        lr_scheduler_disc_params (dict):\n            Parameters for the discriminator learning rate scheduler. Defaults to `{\"gamma\": 0.999, \"last_epoch\": -1}`.\n        scheduler_after_epoch (bool):\n            Whether to update the learning rate schedulers after each epoch. Defaults to True.\n        use_pqmf (bool):\n            enable / disable PQMF for subband approximation at training. Defaults to False.\n        steps_to_start_discriminator (int):\n            Number of steps required to start training the discriminator. Defaults to 0.\n        diff_samples_for_G_and_D (bool):\n            enable / disable use of different training samples for the generator and the discriminator iterations.\n            Enabling it results in slower iterations but faster convergence in some cases. Defaults to False.\n    \"\"\"\n\n    model: str = \"gan\"\n\n    # LOSS PARAMETERS\n    use_stft_loss: bool = True\n    use_subband_stft_loss: bool = True\n    use_mse_gan_loss: bool = True\n    use_hinge_gan_loss: bool = True\n    use_feat_match_loss: bool = True  # requires MelGAN Discriminators (MelGAN and HifiGAN)\n    use_l1_spec_loss: bool = True\n\n    # loss weights\n    stft_loss_weight: float = 0\n    subband_stft_loss_weight: float = 0\n    mse_G_loss_weight: float = 1\n    hinge_G_loss_weight: float = 0\n    feat_match_loss_weight: float = 100\n    l1_spec_loss_weight: float = 45\n\n    stft_loss_params: dict = field(\n        default_factory=lambda: {\n            \"n_ffts\": [1024, 2048, 512],\n            \"hop_lengths\": [120, 240, 50],\n            \"win_lengths\": [600, 1200, 240],\n        }\n    )\n\n    l1_spec_loss_params: dict = field(\n        default_factory=lambda: {\n            \"use_mel\": True,\n            \"sample_rate\": 22050,\n            
\"n_fft\": 1024,\n            \"hop_length\": 256,\n            \"win_length\": 1024,\n            \"n_mels\": 80,\n            \"mel_fmin\": 0.0,\n            \"mel_fmax\": None,\n        }\n    )\n\n    target_loss: str = \"loss_0\"  # loss value to pick the best model to save after each epoch\n\n    # optimizer\n    grad_clip: float = field(default_factory=lambda: [5, 5])\n    lr_gen: float = 0.0002  # Initial learning rate.\n    lr_disc: float = 0.0002  # Initial learning rate.\n    lr_scheduler_gen: str = \"ExponentialLR\"  # one of the schedulers from https:#pytorch.org/docs/stable/optim.html\n    lr_scheduler_gen_params: dict = field(default_factory=lambda: {\"gamma\": 0.999, \"last_epoch\": -1})\n    lr_scheduler_disc: str = \"ExponentialLR\"  # one of the schedulers from https:#pytorch.org/docs/stable/optim.html\n    lr_scheduler_disc_params: dict = field(default_factory=lambda: {\"gamma\": 0.999, \"last_epoch\": -1})\n    scheduler_after_epoch: bool = True\n\n    use_pqmf: bool = False  # enable/disable using pqmf for multi-band training. (Multi-band MelGAN)\n    steps_to_start_discriminator = 0  # start training the discriminator after this number of steps.\n    diff_samples_for_G_and_D: bool = False  # use different samples for G and D training steps.\n"
  },
  {
    "path": "TTS/vocoder/configs/univnet_config.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import Dict\n\nfrom TTS.vocoder.configs.shared_configs import BaseGANVocoderConfig\n\n\n@dataclass\nclass UnivnetConfig(BaseGANVocoderConfig):\n    \"\"\"Defines parameters for UnivNet vocoder.\n\n    Example:\n\n        >>> from TTS.vocoder.configs import UnivNetConfig\n        >>> config = UnivNetConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `UnivNet`.\n        discriminator_model (str): One of the discriminators from `TTS.vocoder.models.*_discriminator`. Defaults to\n            'UnivNet_discriminator`.\n        generator_model (str): One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `UnivNet_generator`.\n        generator_model_params (dict): Parameters of the generator model. Defaults to\n            `\n            {\n                \"use_mel\": True,\n                \"sample_rate\": 22050,\n                \"n_fft\": 1024,\n                \"hop_length\": 256,\n                \"win_length\": 1024,\n                \"n_mels\": 80,\n                \"mel_fmin\": 0.0,\n                \"mel_fmax\": None,\n            }\n            `\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 32.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 8192.\n        pad_short (int):\n            Additional padding applied to the audio samples shorter than `seq_len`. Defaults to 0.\n        use_noise_augment (bool):\n            enable / disable random noise added to the input waveform. The noise is added after computing the\n            features. Defaults to True.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. 
It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        use_stft_loss (bool):\n            enable / disable use of STFT loss originally used by ParallelWaveGAN model. Defaults to True.\n        use_subband_stft_loss (bool):\n            enable / disable use of subband loss computation originally used by MultiBandMelgan model. Defaults to False.\n        use_mse_gan_loss (bool):\n            enable / disable using Mean Squared Error GAN loss. Defaults to True.\n        use_hinge_gan_loss (bool):\n            enable / disable using Hinge GAN loss. You should choose either Hinge or MSE loss for training GAN models.\n            Defaults to False.\n        use_feat_match_loss (bool):\n            enable / disable using Feature Matching loss originally used by MelGAN model. Defaults to False.\n        use_l1_spec_loss (bool):\n            enable / disable using L1 spectrogram loss originally used by univnet model. Defaults to False.\n        stft_loss_params (dict):\n            STFT loss parameters. Defaults to\n            `{\n                \"n_ffts\": [1024, 2048, 512],\n                \"hop_lengths\": [120, 240, 50],\n                \"win_lengths\": [600, 1200, 240]\n            }`\n        l1_spec_loss_params (dict):\n            L1 spectrogram loss parameters. Defaults to\n            `{\n                \"use_mel\": True,\n                \"sample_rate\": 22050,\n                \"n_fft\": 1024,\n                \"hop_length\": 256,\n                \"win_length\": 1024,\n                \"n_mels\": 80,\n                \"mel_fmin\": 0.0,\n                \"mel_fmax\": None,\n            }`\n        stft_loss_weight (float): STFT loss weight that multiplies the computed loss before summing up the total\n            model loss. Defaults to 2.5.\n        subband_stft_loss_weight (float):\n            Subband STFT loss weight that multiplies the computed loss before summing up the total loss. 
Defaults to 0.\n        mse_G_loss_weight (float):\n            MSE generator loss weight that multiplies the computed loss before summing up the total loss. faults to 2.5.\n        hinge_G_loss_weight (float):\n            Hinge generator loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n        feat_match_loss_weight (float):\n            Feature matching loss weight that multiplies the computed loss before summing up the total loss. faults to 108.\n        l1_spec_loss_weight (float):\n            L1 spectrogram loss weight that multiplies the computed loss before summing up the total loss. Defaults to 0.\n    \"\"\"\n\n    model: str = \"univnet\"\n    batch_size: int = 32\n    # model specific params\n    discriminator_model: str = \"univnet_discriminator\"\n    generator_model: str = \"univnet_generator\"\n    generator_model_params: Dict = field(\n        default_factory=lambda: {\n            \"in_channels\": 64,\n            \"out_channels\": 1,\n            \"hidden_channels\": 32,\n            \"cond_channels\": 80,\n            \"upsample_factors\": [8, 8, 4],\n            \"lvc_layers_each_block\": 4,\n            \"lvc_kernel_size\": 3,\n            \"kpnet_hidden_channels\": 64,\n            \"kpnet_conv_size\": 3,\n            \"dropout\": 0.0,\n        }\n    )\n\n    # LOSS PARAMETERS - overrides\n    use_stft_loss: bool = True\n    use_subband_stft_loss: bool = False\n    use_mse_gan_loss: bool = True\n    use_hinge_gan_loss: bool = False\n    use_feat_match_loss: bool = False  # requires MelGAN Discriminators (MelGAN and univnet)\n    use_l1_spec_loss: bool = False\n\n    # loss weights - overrides\n    stft_loss_weight: float = 2.5\n    stft_loss_params: Dict = field(\n        default_factory=lambda: {\n            \"n_ffts\": [1024, 2048, 512],\n            \"hop_lengths\": [120, 240, 50],\n            \"win_lengths\": [600, 1200, 240],\n        }\n    )\n    subband_stft_loss_weight: float = 0\n   
 mse_G_loss_weight: float = 1\n    hinge_G_loss_weight: float = 0\n    feat_match_loss_weight: float = 0\n    l1_spec_loss_weight: float = 0\n    l1_spec_loss_params: Dict = field(\n        default_factory=lambda: {\n            \"use_mel\": True,\n            \"sample_rate\": 22050,\n            \"n_fft\": 1024,\n            \"hop_length\": 256,\n            \"win_length\": 1024,\n            \"n_mels\": 80,\n            \"mel_fmin\": 0.0,\n            \"mel_fmax\": None,\n        }\n    )\n\n    # optimizer parameters\n    lr_gen: float = 1e-4  # Initial learning rate.\n    lr_disc: float = 1e-4  # Initial learning rate.\n    lr_scheduler_gen: str = None  # one of the schedulers from https:#pytorch.org/docs/stable/optim.html\n    # lr_scheduler_gen_params: dict = field(default_factory=lambda: {\"gamma\": 0.999, \"last_epoch\": -1})\n    lr_scheduler_disc: str = None  # one of the schedulers from https:#pytorch.org/docs/stable/optim.html\n    # lr_scheduler_disc_params: dict = field(default_factory=lambda: {\"gamma\": 0.999, \"last_epoch\": -1})\n    optimizer_params: Dict = field(default_factory=lambda: {\"betas\": [0.5, 0.9], \"weight_decay\": 0.0})\n    steps_to_start_discriminator: int = 200000\n\n    def __post_init__(self):\n        super().__post_init__()\n        self.generator_model_params[\"cond_channels\"] = self.audio.num_mels\n"
  },
  {
    "path": "TTS/vocoder/configs/wavegrad_config.py",
    "content": "from dataclasses import dataclass, field\n\nfrom TTS.vocoder.configs.shared_configs import BaseVocoderConfig\nfrom TTS.vocoder.models.wavegrad import WavegradArgs\n\n\n@dataclass\nclass WavegradConfig(BaseVocoderConfig):\n    \"\"\"Defines parameters for WaveGrad vocoder.\n    Example:\n\n        >>> from TTS.vocoder.configs import WavegradConfig\n        >>> config = WavegradConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `wavegrad`.\n        generator_model (str): One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `wavegrad`.\n        model_params (WavegradArgs): Model parameters. Check `WavegradArgs` for default values.\n        target_loss (str):\n            Target loss name that defines the quality of the model. Defaults to `avg_wavegrad_loss`.\n        epochs (int):\n            Number of epochs to traing the model. Defaults to 10000.\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 96.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 6144.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        mixed_precision (bool):\n            enable / disable mixed precision training. Default is True.\n        eval_split_size (int):\n            Number of samples used for evalutaion. Defaults to 50.\n        train_noise_schedule (dict):\n            Training noise schedule. Defaults to\n            `{\"min_val\": 1e-6, \"max_val\": 1e-2, \"num_steps\": 1000}`\n        test_noise_schedule (dict):\n            Inference noise schedule. 
For better performance, you may need to use `bin/tune_wavegrad.py` to find a\n            better schedule. Defaults to\n            `\n            {\n                \"min_val\": 1e-6,\n                \"max_val\": 1e-2,\n                \"num_steps\": 50,\n            }\n            `\n        grad_clip (float):\n            Gradient clipping threshold. If <= 0.0, no clipping is applied. Defaults to 1.0\n        lr (float):\n            Initial learning rate. Defaults to 1e-4.\n        lr_scheduler (str):\n            One of the learning rate schedulers from `torch.optim.lr_scheduler.*`. Defaults to `MultiStepLR`.\n        lr_scheduler_params (dict):\n            kwargs for the scheduler. Defaults to `{\"gamma\": 0.5, \"milestones\": [100000, 200000, 300000, 400000, 500000, 600000]}`\n    \"\"\"\n\n    model: str = \"wavegrad\"\n    # Model specific params\n    generator_model: str = \"wavegrad\"\n    model_params: WavegradArgs = field(default_factory=WavegradArgs)\n    target_loss: str = \"loss\"  # loss value to pick the best model to save after each epoch\n\n    # Training - overrides\n    epochs: int = 10000\n    batch_size: int = 96\n    seq_len: int = 6144\n    use_cache: bool = True\n    mixed_precision: bool = True\n    eval_split_size: int = 50\n\n    # NOISE SCHEDULE PARAMS\n    train_noise_schedule: dict = field(default_factory=lambda: {\"min_val\": 1e-6, \"max_val\": 1e-2, \"num_steps\": 1000})\n\n    test_noise_schedule: dict = field(\n        default_factory=lambda: {  # inference noise schedule. 
Try TTS/bin/tune_wavegrad.py to find the optimal values.\n            \"min_val\": 1e-6,\n            \"max_val\": 1e-2,\n            \"num_steps\": 50,\n        }\n    )\n\n    # optimizer overrides\n    grad_clip: float = 1.0\n    lr: float = 1e-4  # Initial learning rate.\n    lr_scheduler: str = \"MultiStepLR\"  # one of the schedulers from https://pytorch.org/docs/stable/optim.html\n    lr_scheduler_params: dict = field(\n        default_factory=lambda: {\"gamma\": 0.5, \"milestones\": [100000, 200000, 300000, 400000, 500000, 600000]}\n    )\n"
  },
  {
    "path": "TTS/vocoder/configs/wavernn_config.py",
    "content": "from dataclasses import dataclass, field\n\nfrom TTS.vocoder.configs.shared_configs import BaseVocoderConfig\nfrom TTS.vocoder.models.wavernn import WavernnArgs\n\n\n@dataclass\nclass WavernnConfig(BaseVocoderConfig):\n    \"\"\"Defines parameters for Wavernn vocoder.\n    Example:\n\n        >>> from TTS.vocoder.configs import WavernnConfig\n        >>> config = WavernnConfig()\n\n    Args:\n        model (str):\n            Model name used for selecting the right model at initialization. Defaults to `wavernn`.\n        mode (str):\n            Output mode of the WaveRNN vocoder. `mold` for Mixture of Logistic Distribution, `gauss` for a single\n            Gaussian Distribution and `bits` for quantized bits as the model's output.\n        mulaw (bool):\n            enable / disable the use of Mulaw quantization for training. Only applicable if `mode == 'bits'`. Defaults\n            to `True`.\n        generator_model (str):\n            One of the generators from TTS.vocoder.models.*`. Every other non-GAN vocoder model is\n            considered as a generator too. Defaults to `WaveRNN`.\n        wavernn_model_params (dict):\n            kwargs for the WaveRNN model. Defaults to\n            `{\n                \"rnn_dims\": 512,\n                \"fc_dims\": 512,\n                \"compute_dims\": 128,\n                \"res_out_dims\": 128,\n                \"num_res_blocks\": 10,\n                \"use_aux_net\": True,\n                \"use_upsample_net\": True,\n                \"upsample_factors\": [4, 8, 8]\n            }`\n        batched (bool):\n            enable / disable the batched inference. It speeds up the inference by splitting the input into segments and\n            processing the segments in a batch. Then it merges the outputs with a certain overlap and smoothing. If\n            you set it False, without CUDA, it is too slow to be practical. 
Defaults to True.\n        target_samples (int):\n            Size of the segments in batched mode. Defaults to 11000.\n        overlap_samples (int):\n            Size of the overlap between consecutive segments. Defaults to 550.\n        batch_size (int):\n            Batch size used at training. Larger values use more memory. Defaults to 256.\n        seq_len (int):\n            Audio segment length used at training. Larger values use more memory. Defaults to 1280.\n\n        use_noise_augment (bool):\n            enable / disable random noise added to the input waveform. The noise is added after computing the\n            features. Defaults to False.\n        use_cache (bool):\n            enable / disable in memory caching of the computed features. It can cause OOM error if the system RAM is\n            not large enough. Defaults to True.\n        mixed_precision (bool):\n            enable / disable mixed precision training. Default is True.\n        eval_split_size (int):\n            Number of samples used for evaluation. Defaults to 50.\n        num_epochs_before_test (int):\n            Number of epochs to wait before running the next evaluation. Since inference takes some time, it is better to\n            wait some number of epochs so as not to waste training time. Defaults to 10.\n        grad_clip (float):\n            Gradient clipping threshold. If <= 0.0, no clipping is applied. Defaults to 4.0\n        lr (float):\n            Initial learning rate. Defaults to 1e-4.\n        lr_scheduler (str):\n            One of the learning rate schedulers from `torch.optim.lr_scheduler.*`. Defaults to `MultiStepLR`.\n        lr_scheduler_params (dict):\n            kwargs for the scheduler. 
Defaults to `{\"gamma\": 0.5, \"milestones\": [200000, 400000, 600000]}`\n    \"\"\"\n\n    model: str = \"wavernn\"\n\n    # Model specific params\n    model_args: WavernnArgs = field(default_factory=WavernnArgs)\n    target_loss: str = \"loss\"\n\n    # Inference\n    batched: bool = True\n    target_samples: int = 11000\n    overlap_samples: int = 550\n\n    # Training - overrides\n    epochs: int = 10000\n    batch_size: int = 256\n    seq_len: int = 1280\n    use_noise_augment: bool = False\n    use_cache: bool = True\n    mixed_precision: bool = True\n    eval_split_size: int = 50\n    num_epochs_before_test: int = (\n        10  # number of epochs to wait until the next test run (synthesizing a full audio clip).\n    )\n\n    # optimizer overrides\n    grad_clip: float = 4.0\n    lr: float = 1e-4  # Initial learning rate.\n    lr_scheduler: str = \"MultiStepLR\"  # one of the schedulers from https://pytorch.org/docs/stable/optim.html\n    lr_scheduler_params: dict = field(default_factory=lambda: {\"gamma\": 0.5, \"milestones\": [200000, 400000, 600000]})\n"
  },
  {
    "path": "TTS/vocoder/datasets/__init__.py",
    "content": "from typing import List\n\nfrom coqpit import Coqpit\nfrom torch.utils.data import Dataset\n\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.vocoder.datasets.gan_dataset import GANDataset\nfrom TTS.vocoder.datasets.preprocess import load_wav_data, load_wav_feat_data\nfrom TTS.vocoder.datasets.wavegrad_dataset import WaveGradDataset\nfrom TTS.vocoder.datasets.wavernn_dataset import WaveRNNDataset\n\n\ndef setup_dataset(config: Coqpit, ap: AudioProcessor, is_eval: bool, data_items: List, verbose: bool) -> Dataset:\n    if config.model.lower() in \"gan\":\n        dataset = GANDataset(\n            ap=ap,\n            items=data_items,\n            seq_len=config.seq_len,\n            hop_len=ap.hop_length,\n            pad_short=config.pad_short,\n            conv_pad=config.conv_pad,\n            return_pairs=config.diff_samples_for_G_and_D if \"diff_samples_for_G_and_D\" in config else False,\n            is_training=not is_eval,\n            return_segments=not is_eval,\n            use_noise_augment=config.use_noise_augment,\n            use_cache=config.use_cache,\n            verbose=verbose,\n        )\n        dataset.shuffle_mapping()\n    elif config.model.lower() == \"wavegrad\":\n        dataset = WaveGradDataset(\n            ap=ap,\n            items=data_items,\n            seq_len=config.seq_len,\n            hop_len=ap.hop_length,\n            pad_short=config.pad_short,\n            conv_pad=config.conv_pad,\n            is_training=not is_eval,\n            return_segments=True,\n            use_noise_augment=False,\n            use_cache=config.use_cache,\n            verbose=verbose,\n        )\n    elif config.model.lower() == \"wavernn\":\n        dataset = WaveRNNDataset(\n            ap=ap,\n            items=data_items,\n            seq_len=config.seq_len,\n            hop_len=ap.hop_length,\n            pad=config.model_params.pad,\n            mode=config.model_params.mode,\n            
mulaw=config.model_params.mulaw,\n            is_training=not is_eval,\n            verbose=verbose,\n        )\n    else:\n        raise ValueError(f\" [!] Dataset for model {config.model.lower()} cannot be found.\")\n    return dataset\n"
  },
  {
    "path": "TTS/vocoder/datasets/gan_dataset.py",
    "content": "import glob\nimport os\nimport random\nfrom multiprocessing import Manager\n\nimport numpy as np\nimport torch\nfrom torch.utils.data import Dataset\n\n\nclass GANDataset(Dataset):\n    \"\"\"\n    GAN Dataset searchs for all the wav files under root path\n    and converts them to acoustic features on the fly and returns\n    random segments of (audio, feature) couples.\n    \"\"\"\n\n    def __init__(\n        self,\n        ap,\n        items,\n        seq_len,\n        hop_len,\n        pad_short,\n        conv_pad=2,\n        return_pairs=False,\n        is_training=True,\n        return_segments=True,\n        use_noise_augment=False,\n        use_cache=False,\n        verbose=False,\n    ):\n        super().__init__()\n        self.ap = ap\n        self.item_list = items\n        self.compute_feat = not isinstance(items[0], (tuple, list))\n        self.seq_len = seq_len\n        self.hop_len = hop_len\n        self.pad_short = pad_short\n        self.conv_pad = conv_pad\n        self.return_pairs = return_pairs\n        self.is_training = is_training\n        self.return_segments = return_segments\n        self.use_cache = use_cache\n        self.use_noise_augment = use_noise_augment\n        self.verbose = verbose\n\n        assert seq_len % hop_len == 0, \" [!] 
seq_len has to be a multiple of hop_len.\"\n        self.feat_frame_len = seq_len // hop_len + (2 * conv_pad)\n\n        # map G and D instances\n        self.G_to_D_mappings = list(range(len(self.item_list)))\n        self.shuffle_mapping()\n\n        # cache acoustic features\n        if use_cache:\n            self.create_feature_cache()\n\n    def create_feature_cache(self):\n        self.manager = Manager()\n        self.cache = self.manager.list()\n        self.cache += [None for _ in range(len(self.item_list))]\n\n    @staticmethod\n    def find_wav_files(path):\n        return glob.glob(os.path.join(path, \"**\", \"*.wav\"), recursive=True)\n\n    def __len__(self):\n        return len(self.item_list)\n\n    def __getitem__(self, idx):\n        \"\"\"Return different items for Generator and Discriminator and\n        cache acoustic features\"\"\"\n\n        # set the seed differently for each worker\n        if torch.utils.data.get_worker_info():\n            random.seed(torch.utils.data.get_worker_info().seed)\n\n        if self.return_segments:\n            item1 = self.load_item(idx)\n            if self.return_pairs:\n                idx2 = self.G_to_D_mappings[idx]\n                item2 = self.load_item(idx2)\n                return item1, item2\n            return item1\n        item1 = self.load_item(idx)\n        return item1\n\n    def _pad_short_samples(self, audio, mel=None):\n        \"\"\"Pad samples shorter than the output sequence length\"\"\"\n        if len(audio) < self.seq_len:\n            audio = np.pad(audio, (0, self.seq_len - len(audio)), mode=\"constant\", constant_values=0.0)\n\n        if mel is not None and mel.shape[1] < self.feat_frame_len:\n            pad_value = self.ap.melspectrogram(np.zeros([self.ap.win_length]))[:, 0]\n            mel = np.pad(\n                mel,\n                ([0, 0], [0, self.feat_frame_len - mel.shape[1]]),\n                mode=\"constant\",\n                constant_values=pad_value.mean(),\n 
           )\n        return audio, mel\n\n    def shuffle_mapping(self):\n        random.shuffle(self.G_to_D_mappings)\n\n    def load_item(self, idx):\n        \"\"\"load (audio, feat) couple\"\"\"\n        if self.compute_feat:\n            # compute features from wav\n            wavpath = self.item_list[idx]\n            # print(wavpath)\n\n            if self.use_cache and self.cache[idx] is not None:\n                audio, mel = self.cache[idx]\n            else:\n                audio = self.ap.load_wav(wavpath)\n                mel = self.ap.melspectrogram(audio)\n                audio, mel = self._pad_short_samples(audio, mel)\n        else:\n            # load precomputed features\n            wavpath, feat_path = self.item_list[idx]\n\n            if self.use_cache and self.cache[idx] is not None:\n                audio, mel = self.cache[idx]\n            else:\n                audio = self.ap.load_wav(wavpath)\n                mel = np.load(feat_path)\n                audio, mel = self._pad_short_samples(audio, mel)\n\n        # correct the audio length wrt padding applied in stft\n        audio = np.pad(audio, (0, self.hop_len), mode=\"edge\")\n        audio = audio[: mel.shape[-1] * self.hop_len]\n        assert (\n            mel.shape[-1] * self.hop_len == audio.shape[-1]\n        ), f\" [!] 
{mel.shape[-1] * self.hop_len} vs {audio.shape[-1]}\"\n\n        audio = torch.from_numpy(audio).float().unsqueeze(0)\n        mel = torch.from_numpy(mel).float().squeeze(0)\n\n        if self.return_segments:\n            max_mel_start = mel.shape[1] - self.feat_frame_len\n            mel_start = random.randint(0, max_mel_start)\n            mel_end = mel_start + self.feat_frame_len\n            mel = mel[:, mel_start:mel_end]\n\n            audio_start = mel_start * self.hop_len\n            audio = audio[:, audio_start : audio_start + self.seq_len]\n\n        if self.use_noise_augment and self.is_training and self.return_segments:\n            audio = audio + (1 / 32768) * torch.randn_like(audio)\n        return (mel, audio)\n"
  },
  {
    "path": "TTS/vocoder/datasets/preprocess.py",
    "content": "import glob\nimport os\nfrom pathlib import Path\n\nimport numpy as np\nfrom coqpit import Coqpit\nfrom tqdm import tqdm\n\nfrom TTS.utils.audio import AudioProcessor\n\n\ndef preprocess_wav_files(out_path: str, config: Coqpit, ap: AudioProcessor):\n    \"\"\"Process wav and compute mel and quantized wave signal.\n    It is mainly used by WaveRNN dataloader.\n\n    Args:\n        out_path (str): Parent folder path to save the files.\n        config (Coqpit): Model config.\n        ap (AudioProcessor): Audio processor.\n    \"\"\"\n    os.makedirs(os.path.join(out_path, \"quant\"), exist_ok=True)\n    os.makedirs(os.path.join(out_path, \"mel\"), exist_ok=True)\n    wav_files = find_wav_files(config.data_path)\n    for path in tqdm(wav_files):\n        wav_name = Path(path).stem\n        quant_path = os.path.join(out_path, \"quant\", wav_name + \".npy\")\n        mel_path = os.path.join(out_path, \"mel\", wav_name + \".npy\")\n        y = ap.load_wav(path)\n        mel = ap.melspectrogram(y)\n        np.save(mel_path, mel)\n        if isinstance(config.mode, int):\n            quant = ap.mulaw_encode(y, qc=config.mode) if config.model_args.mulaw else ap.quantize(y, bits=config.mode)\n            np.save(quant_path, quant)\n\n\ndef find_wav_files(data_path, file_ext=\"wav\"):\n    wav_paths = glob.glob(os.path.join(data_path, \"**\", f\"*.{file_ext}\"), recursive=True)\n    return wav_paths\n\n\ndef find_feat_files(data_path):\n    feat_paths = glob.glob(os.path.join(data_path, \"**\", \"*.npy\"), recursive=True)\n    return feat_paths\n\n\ndef load_wav_data(data_path, eval_split_size, file_ext=\"wav\"):\n    wav_paths = find_wav_files(data_path, file_ext=file_ext)\n    assert len(wav_paths) > 0, f\" [!] 
{data_path} is empty.\"\n    np.random.seed(0)\n    np.random.shuffle(wav_paths)\n    return wav_paths[:eval_split_size], wav_paths[eval_split_size:]\n\n\ndef load_wav_feat_data(data_path, feat_path, eval_split_size):\n    wav_paths = find_wav_files(data_path)\n    feat_paths = find_feat_files(feat_path)\n\n    wav_paths.sort(key=lambda x: Path(x).stem)\n    feat_paths.sort(key=lambda x: Path(x).stem)\n\n    assert len(wav_paths) == len(feat_paths), f\" [!] {len(wav_paths)} vs {len(feat_paths)}\"\n    for wav, feat in zip(wav_paths, feat_paths):\n        wav_name = Path(wav).stem\n        feat_name = Path(feat).stem\n        assert wav_name == feat_name\n\n    items = list(zip(wav_paths, feat_paths))\n    np.random.seed(0)\n    np.random.shuffle(items)\n    return items[:eval_split_size], items[eval_split_size:]\n"
  },
  {
    "path": "TTS/vocoder/datasets/wavegrad_dataset.py",
    "content": "import glob\nimport os\nimport random\nfrom multiprocessing import Manager\nfrom typing import List, Tuple\n\nimport numpy as np\nimport torch\nfrom torch.utils.data import Dataset\n\n\nclass WaveGradDataset(Dataset):\n    \"\"\"\n    WaveGrad Dataset searchs for all the wav files under root path\n    and converts them to acoustic features on the fly and returns\n    random segments of (audio, feature) couples.\n    \"\"\"\n\n    def __init__(\n        self,\n        ap,\n        items,\n        seq_len,\n        hop_len,\n        pad_short,\n        conv_pad=2,\n        is_training=True,\n        return_segments=True,\n        use_noise_augment=False,\n        use_cache=False,\n        verbose=False,\n    ):\n        super().__init__()\n        self.ap = ap\n        self.item_list = items\n        self.seq_len = seq_len if return_segments else None\n        self.hop_len = hop_len\n        self.pad_short = pad_short\n        self.conv_pad = conv_pad\n        self.is_training = is_training\n        self.return_segments = return_segments\n        self.use_cache = use_cache\n        self.use_noise_augment = use_noise_augment\n        self.verbose = verbose\n\n        if return_segments:\n            assert seq_len % hop_len == 0, \" [!] 
seq_len has to be a multiple of hop_len.\"\n        self.feat_frame_len = seq_len // hop_len + (2 * conv_pad)\n\n        # cache acoustic features\n        if use_cache:\n            self.create_feature_cache()\n\n    def create_feature_cache(self):\n        self.manager = Manager()\n        self.cache = self.manager.list()\n        self.cache += [None for _ in range(len(self.item_list))]\n\n    @staticmethod\n    def find_wav_files(path):\n        return glob.glob(os.path.join(path, \"**\", \"*.wav\"), recursive=True)\n\n    def __len__(self):\n        return len(self.item_list)\n\n    def __getitem__(self, idx):\n        item = self.load_item(idx)\n        return item\n\n    def load_test_samples(self, num_samples: int) -> List[Tuple]:\n        \"\"\"Return test samples.\n\n        Args:\n            num_samples (int): Number of samples to return.\n\n        Returns:\n            List[Tuple]: melspectrogram and audio.\n\n        Shapes:\n            - melspectrogram (Tensor): :math:`[C, T]`\n            - audio (Tensor): :math:`[T_audio]`\n        \"\"\"\n        samples = []\n        return_segments = self.return_segments\n        self.return_segments = False\n        for idx in range(num_samples):\n            mel, audio = self.load_item(idx)\n            samples.append([mel, audio])\n        self.return_segments = return_segments\n        return samples\n\n    def load_item(self, idx):\n        \"\"\"load (audio, feat) couple\"\"\"\n        # compute features from wav\n        wavpath = self.item_list[idx]\n\n        if self.use_cache and self.cache[idx] is not None:\n            audio = self.cache[idx]\n        else:\n            audio = self.ap.load_wav(wavpath)\n\n            if self.return_segments:\n                # correct audio length wrt segment length\n                if audio.shape[-1] < self.seq_len + self.pad_short:\n                    audio = np.pad(\n                        audio, (0, self.seq_len + self.pad_short - len(audio)), 
mode=\"constant\", constant_values=0.0\n                    )\n                assert (\n                    audio.shape[-1] >= self.seq_len + self.pad_short\n                ), f\"{audio.shape[-1]} vs {self.seq_len + self.pad_short}\"\n\n            # correct the audio length wrt hop length\n            p = (audio.shape[-1] // self.hop_len + 1) * self.hop_len - audio.shape[-1]\n            audio = np.pad(audio, (0, p), mode=\"constant\", constant_values=0.0)\n\n            if self.use_cache:\n                self.cache[idx] = audio\n\n        if self.return_segments:\n            max_start = len(audio) - self.seq_len\n            start = random.randint(0, max_start)\n            end = start + self.seq_len\n            audio = audio[start:end]\n\n        if self.use_noise_augment and self.is_training and self.return_segments:\n            # audio is still a NumPy array here, so draw the noise with NumPy\n            audio = audio + (1 / 32768) * np.random.standard_normal(audio.shape)\n\n        mel = self.ap.melspectrogram(audio)\n        mel = mel[..., :-1]  # ignore the padding\n\n        audio = torch.from_numpy(audio).float()\n        mel = torch.from_numpy(mel).float().squeeze(0)\n        return (mel, audio)\n\n    @staticmethod\n    def collate_full_clips(batch):\n        \"\"\"This is used in tune_wavegrad.py.\n        It pads sequences to the max length.\"\"\"\n        max_mel_length = max([b[0].shape[1] for b in batch]) if len(batch) > 1 else batch[0][0].shape[1]\n        max_audio_length = max([b[1].shape[0] for b in batch]) if len(batch) > 1 else batch[0][1].shape[0]\n\n        mels = torch.zeros([len(batch), batch[0][0].shape[0], max_mel_length])\n        audios = torch.zeros([len(batch), max_audio_length])\n\n        for idx, b in enumerate(batch):\n            mel = b[0]\n            audio = b[1]\n            mels[idx, :, : mel.shape[1]] = mel\n            audios[idx, : audio.shape[0]] = audio\n\n        return mels, audios\n"
  },
  {
    "path": "TTS/vocoder/datasets/wavernn_dataset.py",
    "content": "import numpy as np\nimport torch\nfrom torch.utils.data import Dataset\n\n\nclass WaveRNNDataset(Dataset):\n    \"\"\"\n    WaveRNN Dataset searchs for all the wav files under root path\n    and converts them to acoustic features on the fly.\n    \"\"\"\n\n    def __init__(\n        self, ap, items, seq_len, hop_len, pad, mode, mulaw, is_training=True, verbose=False, return_segments=True\n    ):\n        super().__init__()\n        self.ap = ap\n        self.compute_feat = not isinstance(items[0], (tuple, list))\n        self.item_list = items\n        self.seq_len = seq_len\n        self.hop_len = hop_len\n        self.mel_len = seq_len // hop_len\n        self.pad = pad\n        self.mode = mode\n        self.mulaw = mulaw\n        self.is_training = is_training\n        self.verbose = verbose\n        self.return_segments = return_segments\n\n        assert self.seq_len % self.hop_len == 0\n\n    def __len__(self):\n        return len(self.item_list)\n\n    def __getitem__(self, index):\n        item = self.load_item(index)\n        return item\n\n    def load_test_samples(self, num_samples):\n        samples = []\n        return_segments = self.return_segments\n        self.return_segments = False\n        for idx in range(num_samples):\n            mel, audio, _ = self.load_item(idx)\n            samples.append([mel, audio])\n        self.return_segments = return_segments\n        return samples\n\n    def load_item(self, index):\n        \"\"\"\n        load (audio, feat) couple if feature_path is set\n        else compute it on the fly\n        \"\"\"\n        if self.compute_feat:\n            wavpath = self.item_list[index]\n            audio = self.ap.load_wav(wavpath)\n            if self.return_segments:\n                min_audio_len = 2 * self.seq_len + (2 * self.pad * self.hop_len)\n            else:\n                min_audio_len = audio.shape[0] + (2 * self.pad * self.hop_len)\n            if audio.shape[0] < min_audio_len:\n        
        print(\" [!] Instance is too short! : {}\".format(wavpath))\n                audio = np.pad(audio, [0, min_audio_len - audio.shape[0] + self.hop_len])\n            mel = self.ap.melspectrogram(audio)\n\n            if self.mode in [\"gauss\", \"mold\"]:\n                x_input = audio\n            elif isinstance(self.mode, int):\n                x_input = (\n                    self.ap.mulaw_encode(audio, qc=self.mode) if self.mulaw else self.ap.quantize(audio, bits=self.mode)\n                )\n            else:\n                raise RuntimeError(\"Unknown dataset mode - \", self.mode)\n\n        else:\n            wavpath, feat_path = self.item_list[index]\n            mel = np.load(feat_path.replace(\"/quant/\", \"/mel/\"))\n\n            if mel.shape[-1] < self.mel_len + 2 * self.pad:\n                print(\" [!] Instance is too short! : {}\".format(wavpath))\n                # fall back to the next item; each item is a (wavpath, feat_path) pair\n                self.item_list[index] = self.item_list[index + 1]\n                wavpath, feat_path = self.item_list[index]\n                mel = np.load(feat_path.replace(\"/quant/\", \"/mel/\"))\n            if self.mode in [\"gauss\", \"mold\"]:\n                x_input = self.ap.load_wav(wavpath)\n            elif isinstance(self.mode, int):\n                x_input = np.load(feat_path.replace(\"/mel/\", \"/quant/\"))\n            else:\n                raise RuntimeError(\"Unknown dataset mode - \", self.mode)\n\n        return mel, x_input, wavpath\n\n    def collate(self, batch):\n        mel_win = self.seq_len // self.hop_len + 2 * self.pad\n        max_offsets = [x[0].shape[-1] - (mel_win + 2 * self.pad) for x in batch]\n\n        mel_offsets = [np.random.randint(0, offset) for offset in max_offsets]\n        sig_offsets = [(offset + self.pad) * self.hop_len for offset in mel_offsets]\n\n        mels = [x[0][:, mel_offsets[i] : mel_offsets[i] + mel_win] for i, x in enumerate(batch)]\n\n        coarse = [x[1][sig_offsets[i] : sig_offsets[i] + self.seq_len + 1] for i, x in 
enumerate(batch)]\n\n        mels = np.stack(mels).astype(np.float32)\n        if self.mode in [\"gauss\", \"mold\"]:\n            coarse = np.stack(coarse).astype(np.float32)\n            coarse = torch.FloatTensor(coarse)\n            x_input = coarse[:, : self.seq_len]\n        elif isinstance(self.mode, int):\n            coarse = np.stack(coarse).astype(np.int64)\n            coarse = torch.LongTensor(coarse)\n            x_input = 2 * coarse[:, : self.seq_len].float() / (2**self.mode - 1.0) - 1.0\n        y_coarse = coarse[:, 1:]\n        mels = torch.FloatTensor(mels)\n        return x_input, mels, y_coarse\n"
  },
  {
    "path": "TTS/vocoder/layers/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/vocoder/layers/hifigan.py",
    "content": "from torch import nn\n\n\n# pylint: disable=dangerous-default-value\nclass ResStack(nn.Module):\n    def __init__(self, kernel, channel, padding, dilations=[1, 3, 5]):\n        super().__init__()\n        resstack = []\n        for dilation in dilations:\n            resstack += [\n                nn.LeakyReLU(0.2),\n                nn.ReflectionPad1d(dilation),\n                nn.utils.weight_norm(nn.Conv1d(channel, channel, kernel_size=kernel, dilation=dilation)),\n                nn.LeakyReLU(0.2),\n                nn.ReflectionPad1d(padding),\n                nn.utils.weight_norm(nn.Conv1d(channel, channel, kernel_size=1)),\n            ]\n        self.resstack = nn.Sequential(*resstack)\n\n        self.shortcut = nn.utils.weight_norm(nn.Conv1d(channel, channel, kernel_size=1))\n\n    def forward(self, x):\n        x1 = self.shortcut(x)\n        x2 = self.resstack(x)\n        return x1 + x2\n\n    def remove_weight_norm(self):\n        nn.utils.remove_weight_norm(self.shortcut)\n        nn.utils.remove_weight_norm(self.resstack[2])\n        nn.utils.remove_weight_norm(self.resstack[5])\n        nn.utils.remove_weight_norm(self.resstack[8])\n        nn.utils.remove_weight_norm(self.resstack[11])\n        nn.utils.remove_weight_norm(self.resstack[14])\n        nn.utils.remove_weight_norm(self.resstack[17])\n\n\nclass MRF(nn.Module):\n    def __init__(self, kernels, channel, dilations=[1, 3, 5]):  # # pylint: disable=dangerous-default-value\n        super().__init__()\n        self.resblock1 = ResStack(kernels[0], channel, 0, dilations)\n        self.resblock2 = ResStack(kernels[1], channel, 6, dilations)\n        self.resblock3 = ResStack(kernels[2], channel, 12, dilations)\n\n    def forward(self, x):\n        x1 = self.resblock1(x)\n        x2 = self.resblock2(x)\n        x3 = self.resblock3(x)\n        return x1 + x2 + x3\n\n    def remove_weight_norm(self):\n        self.resblock1.remove_weight_norm()\n        
self.resblock2.remove_weight_norm()\n        self.resblock3.remove_weight_norm()\n"
  },
  {
    "path": "TTS/vocoder/layers/losses.py",
    "content": "from typing import Dict, Union\n\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nfrom TTS.utils.audio.torch_transforms import TorchSTFT\nfrom TTS.vocoder.utils.distribution import discretized_mix_logistic_loss, gaussian_loss\n\n#################################\n# GENERATOR LOSSES\n#################################\n\n\nclass STFTLoss(nn.Module):\n    \"\"\"STFT loss. Input generate and real waveforms are converted\n    to spectrograms compared with L1 and Spectral convergence losses.\n    It is from ParallelWaveGAN paper https://arxiv.org/pdf/1910.11480.pdf\"\"\"\n\n    def __init__(self, n_fft, hop_length, win_length):\n        super().__init__()\n        self.n_fft = n_fft\n        self.hop_length = hop_length\n        self.win_length = win_length\n        self.stft = TorchSTFT(n_fft, hop_length, win_length)\n\n    def forward(self, y_hat, y):\n        y_hat_M = self.stft(y_hat)\n        y_M = self.stft(y)\n        # magnitude loss\n        loss_mag = F.l1_loss(torch.log(y_M), torch.log(y_hat_M))\n        # spectral convergence loss\n        loss_sc = torch.norm(y_M - y_hat_M, p=\"fro\") / torch.norm(y_M, p=\"fro\")\n        return loss_mag, loss_sc\n\n\nclass MultiScaleSTFTLoss(torch.nn.Module):\n    \"\"\"Multi-scale STFT loss. 
The generated and real waveforms are converted\n    to spectrograms and compared with L1 and spectral convergence losses.\n    It is from the ParallelWaveGAN paper https://arxiv.org/pdf/1910.11480.pdf\"\"\"\n\n    def __init__(self, n_ffts=(1024, 2048, 512), hop_lengths=(120, 240, 50), win_lengths=(600, 1200, 240)):\n        super().__init__()\n        self.loss_funcs = torch.nn.ModuleList()\n        for n_fft, hop_length, win_length in zip(n_ffts, hop_lengths, win_lengths):\n            self.loss_funcs.append(STFTLoss(n_fft, hop_length, win_length))\n\n    def forward(self, y_hat, y):\n        N = len(self.loss_funcs)\n        loss_sc = 0\n        loss_mag = 0\n        for f in self.loss_funcs:\n            lm, lsc = f(y_hat, y)\n            loss_mag += lm\n            loss_sc += lsc\n        loss_sc /= N\n        loss_mag /= N\n        return loss_mag, loss_sc\n\n\nclass L1SpecLoss(nn.Module):\n    \"\"\"L1 Loss over Spectrograms as described in HiFiGAN paper https://arxiv.org/pdf/2010.05646.pdf\"\"\"\n\n    def __init__(\n        self, sample_rate, n_fft, hop_length, win_length, mel_fmin=None, mel_fmax=None, n_mels=None, use_mel=True\n    ):\n        super().__init__()\n        self.use_mel = use_mel\n        self.stft = TorchSTFT(\n            n_fft,\n            hop_length,\n            win_length,\n            sample_rate=sample_rate,\n            mel_fmin=mel_fmin,\n            mel_fmax=mel_fmax,\n            n_mels=n_mels,\n            use_mel=use_mel,\n        )\n\n    def forward(self, y_hat, y):\n        y_hat_M = self.stft(y_hat)\n        y_M = self.stft(y)\n        # magnitude loss\n        loss_mag = F.l1_loss(torch.log(y_M), torch.log(y_hat_M))\n        return loss_mag\n\n\nclass MultiScaleSubbandSTFTLoss(MultiScaleSTFTLoss):\n    \"\"\"Multiscale STFT loss for multi band model outputs.\n    From MultiBand-MelGAN paper https://arxiv.org/abs/2005.05106\"\"\"\n\n    # pylint: disable=no-self-use\n    def forward(self, y_hat, y):\n        y_hat = 
y_hat.view(-1, 1, y_hat.shape[2])\n        y = y.view(-1, 1, y.shape[2])\n        return super().forward(y_hat.squeeze(1), y.squeeze(1))\n\n\nclass MSEGLoss(nn.Module):\n    \"\"\"Mean Squared Generator Loss\"\"\"\n\n    # pylint: disable=no-self-use\n    def forward(self, score_real):\n        loss_fake = F.mse_loss(score_real, score_real.new_ones(score_real.shape))\n        return loss_fake\n\n\nclass HingeGLoss(nn.Module):\n    \"\"\"Hinge Generator Loss\"\"\"\n\n    # pylint: disable=no-self-use\n    def forward(self, score_real):\n        # TODO: this might be wrong\n        loss_fake = torch.mean(F.relu(1.0 - score_real))\n        return loss_fake\n\n\n##################################\n# DISCRIMINATOR LOSSES\n##################################\n\n\nclass MSEDLoss(nn.Module):\n    \"\"\"Mean Squared Discriminator Loss\"\"\"\n\n    def __init__(\n        self,\n    ):\n        super().__init__()\n        self.loss_func = nn.MSELoss()\n\n    # pylint: disable=no-self-use\n    def forward(self, score_fake, score_real):\n        loss_real = self.loss_func(score_real, score_real.new_ones(score_real.shape))\n        loss_fake = self.loss_func(score_fake, score_fake.new_zeros(score_fake.shape))\n        loss_d = loss_real + loss_fake\n        return loss_d, loss_real, loss_fake\n\n\nclass HingeDLoss(nn.Module):\n    \"\"\"Hinge Discriminator Loss\"\"\"\n\n    # pylint: disable=no-self-use\n    def forward(self, score_fake, score_real):\n        loss_real = torch.mean(F.relu(1.0 - score_real))\n        loss_fake = torch.mean(F.relu(1.0 + score_fake))\n        loss_d = loss_real + loss_fake\n        return loss_d, loss_real, loss_fake\n\n\nclass MelganFeatureLoss(nn.Module):\n    def __init__(\n        self,\n    ):\n        super().__init__()\n        self.loss_func = nn.L1Loss()\n\n    # pylint: disable=no-self-use\n    def forward(self, fake_feats, real_feats):\n        loss_feats = 0\n        num_feats = 0\n        for idx, _ in enumerate(fake_feats):\n       
     for fake_feat, real_feat in zip(fake_feats[idx], real_feats[idx]):\n                loss_feats += self.loss_func(fake_feat, real_feat)\n                num_feats += 1\n        loss_feats = loss_feats / num_feats\n        return loss_feats\n\n\n#####################################\n# LOSS WRAPPERS\n#####################################\n\n\ndef _apply_G_adv_loss(scores_fake, loss_func):\n    \"\"\"Compute G adversarial loss function\n    and normalize values\"\"\"\n    adv_loss = 0\n    if isinstance(scores_fake, list):\n        for score_fake in scores_fake:\n            fake_loss = loss_func(score_fake)\n            adv_loss += fake_loss\n        adv_loss /= len(scores_fake)\n    else:\n        fake_loss = loss_func(scores_fake)\n        adv_loss = fake_loss\n    return adv_loss\n\n\ndef _apply_D_loss(scores_fake, scores_real, loss_func):\n    \"\"\"Compute D loss func and normalize loss values\"\"\"\n    loss = 0\n    real_loss = 0\n    fake_loss = 0\n    if isinstance(scores_fake, list):\n        # multi-scale loss\n        for score_fake, score_real in zip(scores_fake, scores_real):\n            # use separate names so the per-scale values do not overwrite the accumulators\n            total_loss, real_loss_, fake_loss_ = loss_func(score_fake=score_fake, score_real=score_real)\n            loss += total_loss\n            real_loss += real_loss_\n            fake_loss += fake_loss_\n        # normalize loss values with number of scales (discriminators)\n        loss /= len(scores_fake)\n        real_loss /= len(scores_real)\n        fake_loss /= len(scores_fake)\n    else:\n        # single scale loss\n        total_loss, real_loss, fake_loss = loss_func(scores_fake, scores_real)\n        loss = total_loss\n    return loss, real_loss, fake_loss\n\n\n##################################\n# MODEL LOSSES\n##################################\n\n\nclass GeneratorLoss(nn.Module):\n    \"\"\"Generator Loss Wrapper. Based on the model configuration, it sets up the right set of loss functions and computes\n    losses. 
It allows experimenting with different combinations of loss functions with different models by just\n    changing configurations.\n\n    Args:\n        C (AttrDict): model configuration.\n    \"\"\"\n\n    def __init__(self, C):\n        super().__init__()\n        assert not (\n            C.use_mse_gan_loss and C.use_hinge_gan_loss\n        ), \" [!] Cannot use HingeGANLoss and MSEGANLoss together.\"\n\n        self.use_stft_loss = C.use_stft_loss if \"use_stft_loss\" in C else False\n        self.use_subband_stft_loss = C.use_subband_stft_loss if \"use_subband_stft_loss\" in C else False\n        self.use_mse_gan_loss = C.use_mse_gan_loss if \"use_mse_gan_loss\" in C else False\n        self.use_hinge_gan_loss = C.use_hinge_gan_loss if \"use_hinge_gan_loss\" in C else False\n        self.use_feat_match_loss = C.use_feat_match_loss if \"use_feat_match_loss\" in C else False\n        self.use_l1_spec_loss = C.use_l1_spec_loss if \"use_l1_spec_loss\" in C else False\n\n        self.stft_loss_weight = C.stft_loss_weight if \"stft_loss_weight\" in C else 0.0\n        self.subband_stft_loss_weight = C.subband_stft_loss_weight if \"subband_stft_loss_weight\" in C else 0.0\n        self.mse_gan_loss_weight = C.mse_G_loss_weight if \"mse_G_loss_weight\" in C else 0.0\n        self.hinge_gan_loss_weight = C.hinge_G_loss_weight if \"hinge_G_loss_weight\" in C else 0.0\n        self.feat_match_loss_weight = C.feat_match_loss_weight if \"feat_match_loss_weight\" in C else 0.0\n        self.l1_spec_loss_weight = C.l1_spec_loss_weight if \"l1_spec_loss_weight\" in C else 0.0\n\n        if C.use_stft_loss:\n            self.stft_loss = MultiScaleSTFTLoss(**C.stft_loss_params)\n        if C.use_subband_stft_loss:\n            self.subband_stft_loss = MultiScaleSubbandSTFTLoss(**C.subband_stft_loss_params)\n        if C.use_mse_gan_loss:\n            self.mse_loss = MSEGLoss()\n        if C.use_hinge_gan_loss:\n            self.hinge_loss = HingeGLoss()\n        if 
C.use_feat_match_loss:\n            self.feat_match_loss = MelganFeatureLoss()\n        if C.use_l1_spec_loss:\n            assert C.audio[\"sample_rate\"] == C.l1_spec_loss_params[\"sample_rate\"]\n            self.l1_spec_loss = L1SpecLoss(**C.l1_spec_loss_params)\n\n    def forward(\n        self, y_hat=None, y=None, scores_fake=None, feats_fake=None, feats_real=None, y_hat_sub=None, y_sub=None\n    ):\n        gen_loss = 0\n        adv_loss = 0\n        return_dict = {}\n\n        # STFT Loss\n        if self.use_stft_loss:\n            stft_loss_mg, stft_loss_sc = self.stft_loss(y_hat[:, :, : y.size(2)].squeeze(1), y.squeeze(1))\n            return_dict[\"G_stft_loss_mg\"] = stft_loss_mg\n            return_dict[\"G_stft_loss_sc\"] = stft_loss_sc\n            gen_loss = gen_loss + self.stft_loss_weight * (stft_loss_mg + stft_loss_sc)\n\n        # L1 Spec loss\n        if self.use_l1_spec_loss:\n            l1_spec_loss = self.l1_spec_loss(y_hat, y)\n            return_dict[\"G_l1_spec_loss\"] = l1_spec_loss\n            gen_loss = gen_loss + self.l1_spec_loss_weight * l1_spec_loss\n\n        # subband STFT Loss\n        if self.use_subband_stft_loss:\n            subband_stft_loss_mg, subband_stft_loss_sc = self.subband_stft_loss(y_hat_sub, y_sub)\n            return_dict[\"G_subband_stft_loss_mg\"] = subband_stft_loss_mg\n            return_dict[\"G_subband_stft_loss_sc\"] = subband_stft_loss_sc\n            gen_loss = gen_loss + self.subband_stft_loss_weight * (subband_stft_loss_mg + subband_stft_loss_sc)\n\n        # multiscale MSE adversarial loss\n        if self.use_mse_gan_loss and scores_fake is not None:\n            mse_fake_loss = _apply_G_adv_loss(scores_fake, self.mse_loss)\n            return_dict[\"G_mse_fake_loss\"] = mse_fake_loss\n            adv_loss = adv_loss + self.mse_gan_loss_weight * mse_fake_loss\n\n        # multiscale Hinge adversarial loss\n        if self.use_hinge_gan_loss and scores_fake is not None:\n            
hinge_fake_loss = _apply_G_adv_loss(scores_fake, self.hinge_loss)\n            return_dict[\"G_hinge_fake_loss\"] = hinge_fake_loss\n            adv_loss = adv_loss + self.hinge_gan_loss_weight * hinge_fake_loss\n\n        # Feature Matching Loss\n        if self.use_feat_match_loss and not feats_fake is None:\n            feat_match_loss = self.feat_match_loss(feats_fake, feats_real)\n            return_dict[\"G_feat_match_loss\"] = feat_match_loss\n            adv_loss = adv_loss + self.feat_match_loss_weight * feat_match_loss\n        return_dict[\"loss\"] = gen_loss + adv_loss\n        return_dict[\"G_gen_loss\"] = gen_loss\n        return_dict[\"G_adv_loss\"] = adv_loss\n        return return_dict\n\n\nclass DiscriminatorLoss(nn.Module):\n    \"\"\"Like ```GeneratorLoss```\"\"\"\n\n    def __init__(self, C):\n        super().__init__()\n        assert not (\n            C.use_mse_gan_loss and C.use_hinge_gan_loss\n        ), \" [!] Cannot use HingeGANLoss and MSEGANLoss together.\"\n\n        self.use_mse_gan_loss = C.use_mse_gan_loss\n        self.use_hinge_gan_loss = C.use_hinge_gan_loss\n\n        if C.use_mse_gan_loss:\n            self.mse_loss = MSEDLoss()\n        if C.use_hinge_gan_loss:\n            self.hinge_loss = HingeDLoss()\n\n    def forward(self, scores_fake, scores_real):\n        loss = 0\n        return_dict = {}\n\n        if self.use_mse_gan_loss:\n            mse_D_loss, mse_D_real_loss, mse_D_fake_loss = _apply_D_loss(\n                scores_fake=scores_fake, scores_real=scores_real, loss_func=self.mse_loss\n            )\n            return_dict[\"D_mse_gan_loss\"] = mse_D_loss\n            return_dict[\"D_mse_gan_real_loss\"] = mse_D_real_loss\n            return_dict[\"D_mse_gan_fake_loss\"] = mse_D_fake_loss\n            loss += mse_D_loss\n\n        if self.use_hinge_gan_loss:\n            hinge_D_loss, hinge_D_real_loss, hinge_D_fake_loss = _apply_D_loss(\n                scores_fake=scores_fake, scores_real=scores_real, 
loss_func=self.hinge_loss\n            )\n            return_dict[\"D_hinge_gan_loss\"] = hinge_D_loss\n            return_dict[\"D_hinge_gan_real_loss\"] = hinge_D_real_loss\n            return_dict[\"D_hinge_gan_fake_loss\"] = hinge_D_fake_loss\n            loss += hinge_D_loss\n\n        return_dict[\"loss\"] = loss\n        return return_dict\n\n\nclass WaveRNNLoss(nn.Module):\n    def __init__(self, wave_rnn_mode: Union[str, int]):\n        super().__init__()\n        if wave_rnn_mode == \"mold\":\n            self.loss_func = discretized_mix_logistic_loss\n        elif wave_rnn_mode == \"gauss\":\n            self.loss_func = gaussian_loss\n        elif isinstance(wave_rnn_mode, int):\n            self.loss_func = torch.nn.CrossEntropyLoss()\n        else:\n            raise ValueError(\" [!] Unknown mode for Wavernn.\")\n\n    def forward(self, y_hat, y) -> Dict:\n        loss = self.loss_func(y_hat, y)\n        return {\"loss\": loss}\n"
  },
  {
    "path": "TTS/vocoder/layers/lvc_block.py",
    "content": "import torch\nimport torch.nn.functional as F\n\n\nclass KernelPredictor(torch.nn.Module):\n    \"\"\"Kernel predictor for the location-variable convolutions\"\"\"\n\n    def __init__(  # pylint: disable=dangerous-default-value\n        self,\n        cond_channels,\n        conv_in_channels,\n        conv_out_channels,\n        conv_layers,\n        conv_kernel_size=3,\n        kpnet_hidden_channels=64,\n        kpnet_conv_size=3,\n        kpnet_dropout=0.0,\n        kpnet_nonlinear_activation=\"LeakyReLU\",\n        kpnet_nonlinear_activation_params={\"negative_slope\": 0.1},\n    ):\n        \"\"\"\n        Args:\n            cond_channels (int): number of channel for the conditioning sequence,\n            conv_in_channels (int): number of channel for the input sequence,\n            conv_out_channels (int): number of channel for the output sequence,\n            conv_layers (int):\n            kpnet_\n        \"\"\"\n        super().__init__()\n\n        self.conv_in_channels = conv_in_channels\n        self.conv_out_channels = conv_out_channels\n        self.conv_kernel_size = conv_kernel_size\n        self.conv_layers = conv_layers\n\n        l_w = conv_in_channels * conv_out_channels * conv_kernel_size * conv_layers\n        l_b = conv_out_channels * conv_layers\n\n        padding = (kpnet_conv_size - 1) // 2\n        self.input_conv = torch.nn.Sequential(\n            torch.nn.Conv1d(cond_channels, kpnet_hidden_channels, 5, padding=(5 - 1) // 2, bias=True),\n            getattr(torch.nn, kpnet_nonlinear_activation)(**kpnet_nonlinear_activation_params),\n        )\n\n        self.residual_conv = torch.nn.Sequential(\n            torch.nn.Dropout(kpnet_dropout),\n            torch.nn.Conv1d(kpnet_hidden_channels, kpnet_hidden_channels, kpnet_conv_size, padding=padding, bias=True),\n            getattr(torch.nn, kpnet_nonlinear_activation)(**kpnet_nonlinear_activation_params),\n            torch.nn.Conv1d(kpnet_hidden_channels, 
kpnet_hidden_channels, kpnet_conv_size, padding=padding, bias=True),\n            getattr(torch.nn, kpnet_nonlinear_activation)(**kpnet_nonlinear_activation_params),\n            torch.nn.Dropout(kpnet_dropout),\n            torch.nn.Conv1d(kpnet_hidden_channels, kpnet_hidden_channels, kpnet_conv_size, padding=padding, bias=True),\n            getattr(torch.nn, kpnet_nonlinear_activation)(**kpnet_nonlinear_activation_params),\n            torch.nn.Conv1d(kpnet_hidden_channels, kpnet_hidden_channels, kpnet_conv_size, padding=padding, bias=True),\n            getattr(torch.nn, kpnet_nonlinear_activation)(**kpnet_nonlinear_activation_params),\n            torch.nn.Dropout(kpnet_dropout),\n            torch.nn.Conv1d(kpnet_hidden_channels, kpnet_hidden_channels, kpnet_conv_size, padding=padding, bias=True),\n            getattr(torch.nn, kpnet_nonlinear_activation)(**kpnet_nonlinear_activation_params),\n            torch.nn.Conv1d(kpnet_hidden_channels, kpnet_hidden_channels, kpnet_conv_size, padding=padding, bias=True),\n            getattr(torch.nn, kpnet_nonlinear_activation)(**kpnet_nonlinear_activation_params),\n        )\n\n        self.kernel_conv = torch.nn.Conv1d(kpnet_hidden_channels, l_w, kpnet_conv_size, padding=padding, bias=True)\n        self.bias_conv = torch.nn.Conv1d(kpnet_hidden_channels, l_b, kpnet_conv_size, padding=padding, bias=True)\n\n    def forward(self, c):\n        \"\"\"\n        Args:\n            c (Tensor): the conditioning sequence (batch, cond_channels, cond_length)\n        Returns:\n        \"\"\"\n        batch, _, cond_length = c.shape\n\n        c = self.input_conv(c)\n        c = c + self.residual_conv(c)\n        k = self.kernel_conv(c)\n        b = self.bias_conv(c)\n\n        kernels = k.contiguous().view(\n            batch, self.conv_layers, self.conv_in_channels, self.conv_out_channels, self.conv_kernel_size, cond_length\n        )\n        bias = b.contiguous().view(batch, self.conv_layers, self.conv_out_channels, 
cond_length)\n        return kernels, bias\n\n\nclass LVCBlock(torch.nn.Module):\n    \"\"\"the location-variable convolutions\"\"\"\n\n    def __init__(\n        self,\n        in_channels,\n        cond_channels,\n        upsample_ratio,\n        conv_layers=4,\n        conv_kernel_size=3,\n        cond_hop_length=256,\n        kpnet_hidden_channels=64,\n        kpnet_conv_size=3,\n        kpnet_dropout=0.0,\n    ):\n        super().__init__()\n\n        self.cond_hop_length = cond_hop_length\n        self.conv_layers = conv_layers\n        self.conv_kernel_size = conv_kernel_size\n        self.convs = torch.nn.ModuleList()\n\n        self.upsample = torch.nn.ConvTranspose1d(\n            in_channels,\n            in_channels,\n            kernel_size=upsample_ratio * 2,\n            stride=upsample_ratio,\n            padding=upsample_ratio // 2 + upsample_ratio % 2,\n            output_padding=upsample_ratio % 2,\n        )\n\n        self.kernel_predictor = KernelPredictor(\n            cond_channels=cond_channels,\n            conv_in_channels=in_channels,\n            conv_out_channels=2 * in_channels,\n            conv_layers=conv_layers,\n            conv_kernel_size=conv_kernel_size,\n            kpnet_hidden_channels=kpnet_hidden_channels,\n            kpnet_conv_size=kpnet_conv_size,\n            kpnet_dropout=kpnet_dropout,\n        )\n\n        for i in range(conv_layers):\n            padding = (3**i) * int((conv_kernel_size - 1) / 2)\n            conv = torch.nn.Conv1d(\n                in_channels, in_channels, kernel_size=conv_kernel_size, padding=padding, dilation=3**i\n            )\n\n            self.convs.append(conv)\n\n    def forward(self, x, c):\n        \"\"\"forward propagation of the location-variable convolutions.\n        Args:\n            x (Tensor): the input sequence (batch, in_channels, in_length)\n            c (Tensor): the conditioning sequence (batch, cond_channels, cond_length)\n\n        Returns:\n            Tensor: the 
output sequence (batch, in_channels, in_length)\n        \"\"\"\n        in_channels = x.shape[1]\n        kernels, bias = self.kernel_predictor(c)\n\n        x = F.leaky_relu(x, 0.2)\n        x = self.upsample(x)\n\n        for i in range(self.conv_layers):\n            y = F.leaky_relu(x, 0.2)\n            y = self.convs[i](y)\n            y = F.leaky_relu(y, 0.2)\n\n            k = kernels[:, i, :, :, :, :]\n            b = bias[:, i, :, :]\n            y = self.location_variable_convolution(y, k, b, 1, self.cond_hop_length)\n            x = x + torch.sigmoid(y[:, :in_channels, :]) * torch.tanh(y[:, in_channels:, :])\n        return x\n\n    @staticmethod\n    def location_variable_convolution(x, kernel, bias, dilation, hop_size):\n        \"\"\"Perform the location-variable convolution operation on the input sequence (x) using the local convolution kernel.\n        Time: 414 μs ± 309 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each), tested on NVIDIA V100.\n        Args:\n            x (Tensor): the input sequence (batch, in_channels, in_length).\n            kernel (Tensor): the local convolution kernel (batch, in_channels, out_channels, kernel_size, kernel_length)\n            bias (Tensor): the bias for the local convolution (batch, out_channels, kernel_length)\n            dilation (int): the dilation of convolution.\n            hop_size (int): the hop_size of the conditioning sequence.\n        Returns:\n            (Tensor): the output sequence after performing local convolution. 
(batch, out_channels, in_length).\n        \"\"\"\n        batch, _, in_length = x.shape\n        batch, _, out_channels, kernel_size, kernel_length = kernel.shape\n\n        assert in_length == (\n            kernel_length * hop_size\n        ), f\"length of (x, kernel) is not matched, {in_length} vs {kernel_length * hop_size}\"\n\n        padding = dilation * int((kernel_size - 1) / 2)\n        x = F.pad(x, (padding, padding), \"constant\", 0)  # (batch, in_channels, in_length + 2*padding)\n        x = x.unfold(2, hop_size + 2 * padding, hop_size)  # (batch, in_channels, kernel_length, hop_size + 2*padding)\n\n        if hop_size < dilation:\n            x = F.pad(x, (0, dilation), \"constant\", 0)\n        x = x.unfold(\n            3, dilation, dilation\n        )  # (batch, in_channels, kernel_length, (hop_size + 2*padding)/dilation, dilation)\n        x = x[:, :, :, :, :hop_size]\n        x = x.transpose(3, 4)  # (batch, in_channels, kernel_length, dilation, (hop_size + 2*padding)/dilation)\n        x = x.unfold(4, kernel_size, 1)  # (batch, in_channels, kernel_length, dilation, _, kernel_size)\n\n        o = torch.einsum(\"bildsk,biokl->bolsd\", x, kernel)\n        o = o + bias.unsqueeze(-1).unsqueeze(-1)\n        o = o.contiguous().view(batch, out_channels, -1)\n        return o\n"
  },
  {
    "path": "TTS/vocoder/layers/melgan.py",
    "content": "from torch import nn\nfrom torch.nn.utils import weight_norm\n\n\nclass ResidualStack(nn.Module):\n    def __init__(self, channels, num_res_blocks, kernel_size):\n        super().__init__()\n\n        assert (kernel_size - 1) % 2 == 0, \" [!] kernel_size has to be odd.\"\n        base_padding = (kernel_size - 1) // 2\n\n        self.blocks = nn.ModuleList()\n        for idx in range(num_res_blocks):\n            layer_kernel_size = kernel_size\n            layer_dilation = layer_kernel_size**idx\n            layer_padding = base_padding * layer_dilation\n            self.blocks += [\n                nn.Sequential(\n                    nn.LeakyReLU(0.2),\n                    nn.ReflectionPad1d(layer_padding),\n                    weight_norm(\n                        nn.Conv1d(channels, channels, kernel_size=kernel_size, dilation=layer_dilation, bias=True)\n                    ),\n                    nn.LeakyReLU(0.2),\n                    weight_norm(nn.Conv1d(channels, channels, kernel_size=1, bias=True)),\n                )\n            ]\n\n        self.shortcuts = nn.ModuleList(\n            [weight_norm(nn.Conv1d(channels, channels, kernel_size=1, bias=True)) for i in range(num_res_blocks)]\n        )\n\n    def forward(self, x):\n        for block, shortcut in zip(self.blocks, self.shortcuts):\n            x = shortcut(x) + block(x)\n        return x\n\n    def remove_weight_norm(self):\n        for block, shortcut in zip(self.blocks, self.shortcuts):\n            nn.utils.remove_weight_norm(block[2])\n            nn.utils.remove_weight_norm(block[4])\n            nn.utils.remove_weight_norm(shortcut)\n"
  },
  {
    "path": "TTS/vocoder/layers/parallel_wavegan.py",
    "content": "import torch\nfrom torch.nn import functional as F\n\n\nclass ResidualBlock(torch.nn.Module):\n    \"\"\"Residual block module in WaveNet.\"\"\"\n\n    def __init__(\n        self,\n        kernel_size=3,\n        res_channels=64,\n        gate_channels=128,\n        skip_channels=64,\n        aux_channels=80,\n        dropout=0.0,\n        dilation=1,\n        bias=True,\n        use_causal_conv=False,\n    ):\n        super().__init__()\n        self.dropout = dropout\n        # no future time stamps available\n        if use_causal_conv:\n            padding = (kernel_size - 1) * dilation\n        else:\n            assert (kernel_size - 1) % 2 == 0, \"Not support even number kernel size.\"\n            padding = (kernel_size - 1) // 2 * dilation\n        self.use_causal_conv = use_causal_conv\n\n        # dilation conv\n        self.conv = torch.nn.Conv1d(\n            res_channels, gate_channels, kernel_size, padding=padding, dilation=dilation, bias=bias\n        )\n\n        # local conditioning\n        if aux_channels > 0:\n            self.conv1x1_aux = torch.nn.Conv1d(aux_channels, gate_channels, 1, bias=False)\n        else:\n            self.conv1x1_aux = None\n\n        # conv output is split into two groups\n        gate_out_channels = gate_channels // 2\n        self.conv1x1_out = torch.nn.Conv1d(gate_out_channels, res_channels, 1, bias=bias)\n        self.conv1x1_skip = torch.nn.Conv1d(gate_out_channels, skip_channels, 1, bias=bias)\n\n    def forward(self, x, c):\n        \"\"\"\n        x: B x D_res x T\n        c: B x D_aux x T\n        \"\"\"\n        residual = x\n        x = F.dropout(x, p=self.dropout, training=self.training)\n        x = self.conv(x)\n\n        # remove future time steps if use_causal_conv conv\n        x = x[:, :, : residual.size(-1)] if self.use_causal_conv else x\n\n        # split into two part for gated activation\n        splitdim = 1\n        xa, xb = x.split(x.size(splitdim) // 2, dim=splitdim)\n\n    
    # local conditioning\n        if c is not None:\n            assert self.conv1x1_aux is not None\n            c = self.conv1x1_aux(c)\n            ca, cb = c.split(c.size(splitdim) // 2, dim=splitdim)\n            xa, xb = xa + ca, xb + cb\n\n        x = torch.tanh(xa) * torch.sigmoid(xb)\n\n        # for skip connection\n        s = self.conv1x1_skip(x)\n\n        # for residual connection\n        x = (self.conv1x1_out(x) + residual) * (0.5**2)\n\n        return x, s\n"
  },
  {
    "path": "TTS/vocoder/layers/pqmf.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn.functional as F\nfrom scipy import signal as sig\n\n\n# adapted from\n# https://github.com/kan-bayashi/ParallelWaveGAN/tree/master/parallel_wavegan\nclass PQMF(torch.nn.Module):\n    def __init__(self, N=4, taps=62, cutoff=0.15, beta=9.0):\n        super().__init__()\n\n        self.N = N\n        self.taps = taps\n        self.cutoff = cutoff\n        self.beta = beta\n\n        QMF = sig.firwin(taps + 1, cutoff, window=(\"kaiser\", beta))\n        H = np.zeros((N, len(QMF)))\n        G = np.zeros((N, len(QMF)))\n        for k in range(N):\n            constant_factor = (\n                (2 * k + 1) * (np.pi / (2 * N)) * (np.arange(taps + 1) - ((taps - 1) / 2))\n            )  # TODO: (taps - 1) -> taps\n            phase = (-1) ** k * np.pi / 4\n            H[k] = 2 * QMF * np.cos(constant_factor + phase)\n\n            G[k] = 2 * QMF * np.cos(constant_factor - phase)\n\n        H = torch.from_numpy(H[:, None, :]).float()\n        G = torch.from_numpy(G[None, :, :]).float()\n\n        self.register_buffer(\"H\", H)\n        self.register_buffer(\"G\", G)\n\n        updown_filter = torch.zeros((N, N, N)).float()\n        for k in range(N):\n            updown_filter[k, k, 0] = 1.0\n        self.register_buffer(\"updown_filter\", updown_filter)\n        self.N = N\n\n        self.pad_fn = torch.nn.ConstantPad1d(taps // 2, 0.0)\n\n    def forward(self, x):\n        return self.analysis(x)\n\n    def analysis(self, x):\n        return F.conv1d(x, self.H, padding=self.taps // 2, stride=self.N)\n\n    def synthesis(self, x):\n        x = F.conv_transpose1d(x, self.updown_filter * self.N, stride=self.N)\n        x = F.conv1d(x, self.G, padding=self.taps // 2)\n        return x\n"
  },
  {
    "path": "TTS/vocoder/layers/upsample.py",
    "content": "import torch\nfrom torch.nn import functional as F\n\n\nclass Stretch2d(torch.nn.Module):\n    def __init__(self, x_scale, y_scale, mode=\"nearest\"):\n        super().__init__()\n        self.x_scale = x_scale\n        self.y_scale = y_scale\n        self.mode = mode\n\n    def forward(self, x):\n        \"\"\"\n        x (Tensor): Input tensor (B, C, F, T).\n        Tensor: Interpolated tensor (B, C, F * y_scale, T * x_scale),\n        \"\"\"\n        return F.interpolate(x, scale_factor=(self.y_scale, self.x_scale), mode=self.mode)\n\n\nclass UpsampleNetwork(torch.nn.Module):\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        upsample_factors,\n        nonlinear_activation=None,\n        nonlinear_activation_params={},\n        interpolate_mode=\"nearest\",\n        freq_axis_kernel_size=1,\n        use_causal_conv=False,\n    ):\n        super().__init__()\n        self.use_causal_conv = use_causal_conv\n        self.up_layers = torch.nn.ModuleList()\n        for scale in upsample_factors:\n            # interpolation layer\n            stretch = Stretch2d(scale, 1, interpolate_mode)\n            self.up_layers += [stretch]\n\n            # conv layer\n            assert (freq_axis_kernel_size - 1) % 2 == 0, \"Not support even number freq axis kernel size.\"\n            freq_axis_padding = (freq_axis_kernel_size - 1) // 2\n            kernel_size = (freq_axis_kernel_size, scale * 2 + 1)\n            if use_causal_conv:\n                padding = (freq_axis_padding, scale * 2)\n            else:\n                padding = (freq_axis_padding, scale)\n            conv = torch.nn.Conv2d(1, 1, kernel_size=kernel_size, padding=padding, bias=False)\n            self.up_layers += [conv]\n\n            # nonlinear\n            if nonlinear_activation is not None:\n                nonlinear = getattr(torch.nn, nonlinear_activation)(**nonlinear_activation_params)\n                self.up_layers += [nonlinear]\n\n    
def forward(self, c):\n        \"\"\"\n        c :  (B, C, T_in).\n        Tensor: (B, C, T_upsample)\n        \"\"\"\n        c = c.unsqueeze(1)  # (B, 1, C, T)\n        for f in self.up_layers:\n            c = f(c)\n        return c.squeeze(1)  # (B, C, T')\n\n\nclass ConvUpsample(torch.nn.Module):\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        upsample_factors,\n        nonlinear_activation=None,\n        nonlinear_activation_params={},\n        interpolate_mode=\"nearest\",\n        freq_axis_kernel_size=1,\n        aux_channels=80,\n        aux_context_window=0,\n        use_causal_conv=False,\n    ):\n        super().__init__()\n        self.aux_context_window = aux_context_window\n        self.use_causal_conv = use_causal_conv and aux_context_window > 0\n        # To capture wide-context information in conditional features\n        kernel_size = aux_context_window + 1 if use_causal_conv else 2 * aux_context_window + 1\n        # NOTE(kan-bayashi): Here do not use padding because the input is already padded\n        self.conv_in = torch.nn.Conv1d(aux_channels, aux_channels, kernel_size=kernel_size, bias=False)\n        self.upsample = UpsampleNetwork(\n            upsample_factors=upsample_factors,\n            nonlinear_activation=nonlinear_activation,\n            nonlinear_activation_params=nonlinear_activation_params,\n            interpolate_mode=interpolate_mode,\n            freq_axis_kernel_size=freq_axis_kernel_size,\n            use_causal_conv=use_causal_conv,\n        )\n\n    def forward(self, c):\n        \"\"\"\n        c : (B, C, T_in).\n        Tensor: (B, C, T_upsampled),\n        \"\"\"\n        c_ = self.conv_in(c)\n        c = c_[:, :, : -self.aux_context_window] if self.use_causal_conv else c_\n        return self.upsample(c)\n"
  },
  {
    "path": "TTS/vocoder/layers/wavegrad.py",
    "content": "import torch\nimport torch.nn.functional as F\nfrom torch import nn\nfrom torch.nn.utils import weight_norm\n\n\nclass Conv1d(nn.Conv1d):\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        nn.init.orthogonal_(self.weight)\n        nn.init.zeros_(self.bias)\n\n\nclass PositionalEncoding(nn.Module):\n    \"\"\"Positional encoding with noise level conditioning\"\"\"\n\n    def __init__(self, n_channels, max_len=10000):\n        super().__init__()\n        self.n_channels = n_channels\n        self.max_len = max_len\n        self.C = 5000\n        self.pe = torch.zeros(0, 0)\n\n    def forward(self, x, noise_level):\n        if x.shape[2] > self.pe.shape[1]:\n            self.init_pe_matrix(x.shape[1], x.shape[2], x)\n        return x + noise_level[..., None, None] + self.pe[:, : x.size(2)].repeat(x.shape[0], 1, 1) / self.C\n\n    def init_pe_matrix(self, n_channels, max_len, x):\n        pe = torch.zeros(max_len, n_channels)\n        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)\n        div_term = torch.pow(10000, torch.arange(0, n_channels, 2).float() / n_channels)\n\n        pe[:, 0::2] = torch.sin(position / div_term)\n        pe[:, 1::2] = torch.cos(position / div_term)\n        self.pe = pe.transpose(0, 1).to(x)\n\n\nclass FiLM(nn.Module):\n    def __init__(self, input_size, output_size):\n        super().__init__()\n        self.encoding = PositionalEncoding(input_size)\n        self.input_conv = nn.Conv1d(input_size, input_size, 3, padding=1)\n        self.output_conv = nn.Conv1d(input_size, output_size * 2, 3, padding=1)\n\n        nn.init.xavier_uniform_(self.input_conv.weight)\n        nn.init.xavier_uniform_(self.output_conv.weight)\n        nn.init.zeros_(self.input_conv.bias)\n        nn.init.zeros_(self.output_conv.bias)\n\n    def forward(self, x, noise_scale):\n        o = self.input_conv(x)\n        o = F.leaky_relu(o, 0.2)\n        o = self.encoding(o, noise_scale)\n    
    shift, scale = torch.chunk(self.output_conv(o), 2, dim=1)\n        return shift, scale\n\n    def remove_weight_norm(self):\n        nn.utils.remove_weight_norm(self.input_conv)\n        nn.utils.remove_weight_norm(self.output_conv)\n\n    def apply_weight_norm(self):\n        self.input_conv = weight_norm(self.input_conv)\n        self.output_conv = weight_norm(self.output_conv)\n\n\n@torch.jit.script\ndef shif_and_scale(x, scale, shift):\n    o = shift + scale * x\n    return o\n\n\nclass UBlock(nn.Module):\n    def __init__(self, input_size, hidden_size, factor, dilation):\n        super().__init__()\n        assert isinstance(dilation, (list, tuple))\n        assert len(dilation) == 4\n\n        self.factor = factor\n        self.res_block = Conv1d(input_size, hidden_size, 1)\n        self.main_block = nn.ModuleList(\n            [\n                Conv1d(input_size, hidden_size, 3, dilation=dilation[0], padding=dilation[0]),\n                Conv1d(hidden_size, hidden_size, 3, dilation=dilation[1], padding=dilation[1]),\n            ]\n        )\n        self.out_block = nn.ModuleList(\n            [\n                Conv1d(hidden_size, hidden_size, 3, dilation=dilation[2], padding=dilation[2]),\n                Conv1d(hidden_size, hidden_size, 3, dilation=dilation[3], padding=dilation[3]),\n            ]\n        )\n\n    def forward(self, x, shift, scale):\n        x_inter = F.interpolate(x, size=x.shape[-1] * self.factor)\n        res = self.res_block(x_inter)\n        o = F.leaky_relu(x_inter, 0.2)\n        o = F.interpolate(o, size=x.shape[-1] * self.factor)\n        o = self.main_block[0](o)\n        o = shif_and_scale(o, scale, shift)\n        o = F.leaky_relu(o, 0.2)\n        o = self.main_block[1](o)\n        res2 = res + o\n        o = shif_and_scale(res2, scale, shift)\n        o = F.leaky_relu(o, 0.2)\n        o = self.out_block[0](o)\n        o = shif_and_scale(o, scale, shift)\n        o = F.leaky_relu(o, 0.2)\n        o = 
self.out_block[1](o)\n        o = o + res2\n        return o\n\n    def remove_weight_norm(self):\n        nn.utils.remove_weight_norm(self.res_block)\n        for _, layer in enumerate(self.main_block):\n            if len(layer.state_dict()) != 0:\n                nn.utils.remove_weight_norm(layer)\n        for _, layer in enumerate(self.out_block):\n            if len(layer.state_dict()) != 0:\n                nn.utils.remove_weight_norm(layer)\n\n    def apply_weight_norm(self):\n        self.res_block = weight_norm(self.res_block)\n        for idx, layer in enumerate(self.main_block):\n            if len(layer.state_dict()) != 0:\n                self.main_block[idx] = weight_norm(layer)\n        for idx, layer in enumerate(self.out_block):\n            if len(layer.state_dict()) != 0:\n                self.out_block[idx] = weight_norm(layer)\n\n\nclass DBlock(nn.Module):\n    def __init__(self, input_size, hidden_size, factor):\n        super().__init__()\n        self.factor = factor\n        self.res_block = Conv1d(input_size, hidden_size, 1)\n        self.main_block = nn.ModuleList(\n            [\n                Conv1d(input_size, hidden_size, 3, dilation=1, padding=1),\n                Conv1d(hidden_size, hidden_size, 3, dilation=2, padding=2),\n                Conv1d(hidden_size, hidden_size, 3, dilation=4, padding=4),\n            ]\n        )\n\n    def forward(self, x):\n        size = x.shape[-1] // self.factor\n        res = self.res_block(x)\n        res = F.interpolate(res, size=size)\n        o = F.interpolate(x, size=size)\n        for layer in self.main_block:\n            o = F.leaky_relu(o, 0.2)\n            o = layer(o)\n        return o + res\n\n    def remove_weight_norm(self):\n        nn.utils.remove_weight_norm(self.res_block)\n        for _, layer in enumerate(self.main_block):\n            if len(layer.state_dict()) != 0:\n                nn.utils.remove_weight_norm(layer)\n\n    def apply_weight_norm(self):\n        self.res_block 
= weight_norm(self.res_block)\n        for idx, layer in enumerate(self.main_block):\n            if len(layer.state_dict()) != 0:\n                self.main_block[idx] = weight_norm(layer)\n"
  },
  {
    "path": "TTS/vocoder/models/__init__.py",
    "content": "import importlib\nimport re\n\nfrom coqpit import Coqpit\n\n\ndef to_camel(text):\n    text = text.capitalize()\n    return re.sub(r\"(?!^)_([a-zA-Z])\", lambda m: m.group(1).upper(), text)\n\n\ndef setup_model(config: Coqpit):\n    \"\"\"Load models directly from configuration.\"\"\"\n    if \"discriminator_model\" in config and \"generator_model\" in config:\n        MyModel = importlib.import_module(\"TTS.vocoder.models.gan\")\n        MyModel = getattr(MyModel, \"GAN\")\n    else:\n        MyModel = importlib.import_module(\"TTS.vocoder.models.\" + config.model.lower())\n        if config.model.lower() == \"wavernn\":\n            MyModel = getattr(MyModel, \"Wavernn\")\n        elif config.model.lower() == \"gan\":\n            MyModel = getattr(MyModel, \"GAN\")\n        elif config.model.lower() == \"wavegrad\":\n            MyModel = getattr(MyModel, \"Wavegrad\")\n        else:\n            try:\n                MyModel = getattr(MyModel, to_camel(config.model))\n            except ModuleNotFoundError as e:\n                raise ValueError(f\"Model {config.model} not exist!\") from e\n    print(\" > Vocoder Model: {}\".format(config.model))\n    return MyModel.init_from_config(config)\n\n\ndef setup_generator(c):\n    \"\"\"TODO: use config object as arguments\"\"\"\n    print(\" > Generator Model: {}\".format(c.generator_model))\n    MyModel = importlib.import_module(\"TTS.vocoder.models.\" + c.generator_model.lower())\n    MyModel = getattr(MyModel, to_camel(c.generator_model))\n    # this is to preserve the Wavernn class name (instead of Wavernn)\n    if c.generator_model.lower() in \"hifigan_generator\":\n        model = MyModel(in_channels=c.audio[\"num_mels\"], out_channels=1, **c.generator_model_params)\n    elif c.generator_model.lower() in \"melgan_generator\":\n        model = MyModel(\n            in_channels=c.audio[\"num_mels\"],\n            out_channels=1,\n            proj_kernel=7,\n            base_channels=512,\n         
   upsample_factors=c.generator_model_params[\"upsample_factors\"],\n            res_kernel=3,\n            num_res_blocks=c.generator_model_params[\"num_res_blocks\"],\n        )\n    elif c.generator_model in \"melgan_fb_generator\":\n        raise ValueError(\"melgan_fb_generator is now fullband_melgan_generator\")\n    elif c.generator_model.lower() in \"multiband_melgan_generator\":\n        model = MyModel(\n            in_channels=c.audio[\"num_mels\"],\n            out_channels=4,\n            proj_kernel=7,\n            base_channels=384,\n            upsample_factors=c.generator_model_params[\"upsample_factors\"],\n            res_kernel=3,\n            num_res_blocks=c.generator_model_params[\"num_res_blocks\"],\n        )\n    elif c.generator_model.lower() in \"fullband_melgan_generator\":\n        model = MyModel(\n            in_channels=c.audio[\"num_mels\"],\n            out_channels=1,\n            proj_kernel=7,\n            base_channels=512,\n            upsample_factors=c.generator_model_params[\"upsample_factors\"],\n            res_kernel=3,\n            num_res_blocks=c.generator_model_params[\"num_res_blocks\"],\n        )\n    elif c.generator_model.lower() in \"parallel_wavegan_generator\":\n        model = MyModel(\n            in_channels=1,\n            out_channels=1,\n            kernel_size=3,\n            num_res_blocks=c.generator_model_params[\"num_res_blocks\"],\n            stacks=c.generator_model_params[\"stacks\"],\n            res_channels=64,\n            gate_channels=128,\n            skip_channels=64,\n            aux_channels=c.audio[\"num_mels\"],\n            dropout=0.0,\n            bias=True,\n            use_weight_norm=True,\n            upsample_factors=c.generator_model_params[\"upsample_factors\"],\n        )\n    elif c.generator_model.lower() in \"univnet_generator\":\n        model = MyModel(**c.generator_model_params)\n    else:\n        raise NotImplementedError(f\"Model {c.generator_model} not 
implemented!\")\n    return model\n\n\ndef setup_discriminator(c):\n    \"\"\"TODO: use config object as arguments\"\"\"\n    print(\" > Discriminator Model: {}\".format(c.discriminator_model))\n    if \"parallel_wavegan\" in c.discriminator_model:\n        MyModel = importlib.import_module(\"TTS.vocoder.models.parallel_wavegan_discriminator\")\n    else:\n        MyModel = importlib.import_module(\"TTS.vocoder.models.\" + c.discriminator_model.lower())\n    MyModel = getattr(MyModel, to_camel(c.discriminator_model.lower()))\n    if c.discriminator_model in \"hifigan_discriminator\":\n        model = MyModel()\n    if c.discriminator_model in \"random_window_discriminator\":\n        model = MyModel(\n            cond_channels=c.audio[\"num_mels\"],\n            hop_length=c.audio[\"hop_length\"],\n            uncond_disc_donwsample_factors=c.discriminator_model_params[\"uncond_disc_donwsample_factors\"],\n            cond_disc_downsample_factors=c.discriminator_model_params[\"cond_disc_downsample_factors\"],\n            cond_disc_out_channels=c.discriminator_model_params[\"cond_disc_out_channels\"],\n            window_sizes=c.discriminator_model_params[\"window_sizes\"],\n        )\n    if c.discriminator_model in \"melgan_multiscale_discriminator\":\n        model = MyModel(\n            in_channels=1,\n            out_channels=1,\n            kernel_sizes=(5, 3),\n            base_channels=c.discriminator_model_params[\"base_channels\"],\n            max_channels=c.discriminator_model_params[\"max_channels\"],\n            downsample_factors=c.discriminator_model_params[\"downsample_factors\"],\n        )\n    if c.discriminator_model == \"residual_parallel_wavegan_discriminator\":\n        model = MyModel(\n            in_channels=1,\n            out_channels=1,\n            kernel_size=3,\n            num_layers=c.discriminator_model_params[\"num_layers\"],\n            stacks=c.discriminator_model_params[\"stacks\"],\n            res_channels=64,\n          
  gate_channels=128,\n            skip_channels=64,\n            dropout=0.0,\n            bias=True,\n            nonlinear_activation=\"LeakyReLU\",\n            nonlinear_activation_params={\"negative_slope\": 0.2},\n        )\n    if c.discriminator_model == \"parallel_wavegan_discriminator\":\n        model = MyModel(\n            in_channels=1,\n            out_channels=1,\n            kernel_size=3,\n            num_layers=c.discriminator_model_params[\"num_layers\"],\n            conv_channels=64,\n            dilation_factor=1,\n            nonlinear_activation=\"LeakyReLU\",\n            nonlinear_activation_params={\"negative_slope\": 0.2},\n            bias=True,\n        )\n    if c.discriminator_model == \"univnet_discriminator\":\n        model = MyModel()\n    return model\n"
  },
  {
    "path": "TTS/vocoder/models/base_vocoder.py",
    "content": "from coqpit import Coqpit\n\nfrom TTS.model import BaseTrainerModel\n\n# pylint: skip-file\n\n\nclass BaseVocoder(BaseTrainerModel):\n    \"\"\"Base `vocoder` class. Every new `vocoder` model must inherit this.\n\n    It defines `vocoder` specific functions on top of `Model`.\n\n    Notes on input/output tensor shapes:\n        Any input or output tensor of the model must be shaped as\n\n        - 3D tensors `batch x time x channels`\n        - 2D tensors `batch x channels`\n        - 1D tensors `batch x 1`\n    \"\"\"\n\n    MODEL_TYPE = \"vocoder\"\n\n    def __init__(self, config):\n        super().__init__()\n        self._set_model_args(config)\n\n    def _set_model_args(self, config: Coqpit):\n        \"\"\"Setup model args based on the config type.\n\n        If the config is for training with a name like \"*Config\", then the model args are embeded in the\n        config.model_args\n\n        If the config is for the model with a name like \"*Args\", then we assign the directly.\n        \"\"\"\n        # don't use isintance not to import recursively\n        if \"Config\" in config.__class__.__name__:\n            if \"characters\" in config:\n                _, self.config, num_chars = self.get_characters(config)\n                self.config.num_chars = num_chars\n                if hasattr(self.config, \"model_args\"):\n                    config.model_args.num_chars = num_chars\n                    if \"model_args\" in config:\n                        self.args = self.config.model_args\n                    # This is for backward compatibility\n                    if \"model_params\" in config:\n                        self.args = self.config.model_params\n            else:\n                self.config = config\n                if \"model_args\" in config:\n                    self.args = self.config.model_args\n                # This is for backward compatibility\n                if \"model_params\" in config:\n                    
self.args = self.config.model_params\n        else:\n            raise ValueError(\"config must be either a *Config or *Args\")\n"
  },
  {
    "path": "TTS/vocoder/models/fullband_melgan_generator.py",
    "content": "import torch\n\nfrom TTS.vocoder.models.melgan_generator import MelganGenerator\n\n\nclass FullbandMelganGenerator(MelganGenerator):\n    def __init__(\n        self,\n        in_channels=80,\n        out_channels=1,\n        proj_kernel=7,\n        base_channels=512,\n        upsample_factors=(2, 8, 2, 2),\n        res_kernel=3,\n        num_res_blocks=4,\n    ):\n        super().__init__(\n            in_channels=in_channels,\n            out_channels=out_channels,\n            proj_kernel=proj_kernel,\n            base_channels=base_channels,\n            upsample_factors=upsample_factors,\n            res_kernel=res_kernel,\n            num_res_blocks=num_res_blocks,\n        )\n\n    @torch.no_grad()\n    def inference(self, cond_features):\n        cond_features = cond_features.to(self.layers[1].weight.device)\n        cond_features = torch.nn.functional.pad(\n            cond_features, (self.inference_padding, self.inference_padding), \"replicate\"\n        )\n        return self.layers(cond_features)\n"
  },
  {
    "path": "TTS/vocoder/models/gan.py",
    "content": "from inspect import signature\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.utils.data import DataLoader\nfrom torch.utils.data.distributed import DistributedSampler\nfrom trainer.trainer_utils import get_optimizer, get_scheduler\n\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.utils.io import load_fsspec\nfrom TTS.vocoder.datasets.gan_dataset import GANDataset\nfrom TTS.vocoder.layers.losses import DiscriminatorLoss, GeneratorLoss\nfrom TTS.vocoder.models import setup_discriminator, setup_generator\nfrom TTS.vocoder.models.base_vocoder import BaseVocoder\nfrom TTS.vocoder.utils.generic_utils import plot_results\n\n\nclass GAN(BaseVocoder):\n    def __init__(self, config: Coqpit, ap: AudioProcessor = None):\n        \"\"\"Wrap a generator and a discriminator network. It provides a compatible interface for the trainer.\n        It also makes it easy to mix and match different generator and discriminator networks.\n\n        To implement a new GAN model, you only need to define the generator and the discriminator networks; the rest\n        is handled by the `GAN` class.\n\n        Args:\n            config (Coqpit): Model configuration.\n            ap (AudioProcessor): 🐸TTS AudioProcessor instance. 
Defaults to None.\n\n        Examples:\n            Initializing the GAN model with HifiGAN generator and discriminator.\n            >>> from TTS.vocoder.configs import HifiganConfig\n            >>> config = HifiganConfig()\n            >>> model = GAN(config)\n        \"\"\"\n        super().__init__(config)\n        self.config = config\n        self.model_g = setup_generator(config)\n        self.model_d = setup_discriminator(config)\n        self.train_disc = False  # if False, train only the generator.\n        self.y_hat_g = None  # the last generator prediction to be passed onto the discriminator\n        self.ap = ap\n\n    def forward(self, x: torch.Tensor) -> torch.Tensor:\n        \"\"\"Run the generator's forward pass.\n\n        Args:\n            x (torch.Tensor): Input tensor.\n\n        Returns:\n            torch.Tensor: output of the GAN generator network.\n        \"\"\"\n        return self.model_g.forward(x)\n\n    def inference(self, x: torch.Tensor) -> torch.Tensor:\n        \"\"\"Run the generator's inference pass.\n\n        Args:\n            x (torch.Tensor): Input tensor.\n        Returns:\n            torch.Tensor: output of the GAN generator network.\n        \"\"\"\n        return self.model_g.inference(x)\n\n    def train_step(self, batch: Dict, criterion: Dict, optimizer_idx: int) -> Tuple[Dict, Dict]:\n        \"\"\"Compute model outputs and the loss values. 
`optimizer_idx` selects whether the generator or the discriminator\n        network is optimized on the current pass.\n\n        Args:\n            batch (Dict): Batch of samples returned by the dataloader.\n            criterion (Dict): Criterion used to compute the losses.\n            optimizer_idx (int): ID of the optimizer in use on the current pass.\n\n        Raises:\n            ValueError: `optimizer_idx` is an unexpected value.\n\n        Returns:\n            Tuple[Dict, Dict]: model outputs and the computed loss values.\n        \"\"\"\n        outputs = {}\n        loss_dict = {}\n\n        x = batch[\"input\"]\n        y = batch[\"waveform\"]\n\n        if optimizer_idx not in [0, 1]:\n            raise ValueError(\" [!] Unexpected `optimizer_idx`.\")\n\n        if optimizer_idx == 0:\n            # DISCRIMINATOR optimization\n\n            # generator pass\n            y_hat = self.model_g(x)[:, :, : y.size(2)]\n\n            # cache for generator loss\n            # pylint: disable=W0201\n            self.y_hat_g = y_hat\n            self.y_hat_sub = None\n            self.y_sub_g = None\n\n            # PQMF formatting\n            if y_hat.shape[1] > 1:\n                self.y_hat_sub = y_hat\n                y_hat = self.model_g.pqmf_synthesis(y_hat)\n                self.y_hat_g = y_hat  # save for generator loss\n                self.y_sub_g = self.model_g.pqmf_analysis(y)\n\n            scores_fake, feats_fake, feats_real = None, None, None\n\n            if self.train_disc:\n                # use different samples for G and D trainings\n                if self.config.diff_samples_for_G_and_D:\n                    x_d = batch[\"input_disc\"]\n                    y_d = batch[\"waveform_disc\"]\n                    # use a different sample than generator\n                    with torch.no_grad():\n                        y_hat = self.model_g(x_d)\n\n                    # PQMF formatting\n                    if y_hat.shape[1] > 1:\n                        y_hat = 
self.model_g.pqmf_synthesis(y_hat)\n                else:\n                    # use the same samples as generator\n                    x_d = x.clone()\n                    y_d = y.clone()\n                    y_hat = self.y_hat_g\n\n                # run D with or without cond. features\n                if len(signature(self.model_d.forward).parameters) == 2:\n                    D_out_fake = self.model_d(y_hat.detach().clone(), x_d)\n                    D_out_real = self.model_d(y_d, x_d)\n                else:\n                    D_out_fake = self.model_d(y_hat.detach())\n                    D_out_real = self.model_d(y_d)\n\n                # format D outputs\n                if isinstance(D_out_fake, tuple):\n                    # self.model_d returns scores and features\n                    scores_fake, feats_fake = D_out_fake\n                    if D_out_real is None:\n                        scores_real, feats_real = None, None\n                    else:\n                        scores_real, feats_real = D_out_real\n                else:\n                    # model D returns only scores\n                    scores_fake = D_out_fake\n                    scores_real = D_out_real\n\n                # compute losses\n                loss_dict = criterion[optimizer_idx](scores_fake, scores_real)\n                outputs = {\"model_outputs\": y_hat}\n\n        if optimizer_idx == 1:\n            # GENERATOR loss\n            scores_fake, feats_fake, feats_real = None, None, None\n            if self.train_disc:\n                if len(signature(self.model_d.forward).parameters) == 2:\n                    D_out_fake = self.model_d(self.y_hat_g, x)\n                else:\n                    D_out_fake = self.model_d(self.y_hat_g)\n                D_out_real = None\n\n                if self.config.use_feat_match_loss:\n                    with torch.no_grad():\n                        D_out_real = self.model_d(y)\n\n                # format D outputs\n           
     if isinstance(D_out_fake, tuple):\n                    scores_fake, feats_fake = D_out_fake\n                    if D_out_real is None:\n                        feats_real = None\n                    else:\n                        _, feats_real = D_out_real\n                else:\n                    scores_fake = D_out_fake\n                    feats_fake, feats_real = None, None\n\n            # compute losses\n            loss_dict = criterion[optimizer_idx](\n                self.y_hat_g, y, scores_fake, feats_fake, feats_real, self.y_hat_sub, self.y_sub_g\n            )\n            outputs = {\"model_outputs\": self.y_hat_g}\n        return outputs, loss_dict\n\n    def _log(self, name: str, ap: AudioProcessor, batch: Dict, outputs: Dict) -> Tuple[Dict, Dict]:\n        \"\"\"Logging shared by the training and evaluation.\n\n        Args:\n            name (str): Name of the run. `train` or `eval`.\n            ap (AudioProcessor): Audio processor used in training.\n            batch (Dict): Batch used in the last train/eval step.\n            outputs (Dict): Model outputs from the last train/eval step.\n\n        Returns:\n            Tuple[Dict, Dict]: log figures and audio samples.\n        \"\"\"\n        y_hat = outputs[0][\"model_outputs\"] if self.train_disc else outputs[1][\"model_outputs\"]\n        y = batch[\"waveform\"]\n        figures = plot_results(y_hat, y, ap, name)\n        sample_voice = y_hat[0].squeeze(0).detach().cpu().numpy()\n        audios = {f\"{name}/audio\": sample_voice}\n        return figures, audios\n\n    def train_log(\n        self, batch: Dict, outputs: Dict, logger: \"Logger\", assets: Dict, steps: int  # pylint: disable=unused-argument\n    ) -> Tuple[Dict, np.ndarray]:\n        \"\"\"Call `_log()` for training.\"\"\"\n        figures, audios = self._log(\"train\", self.ap, batch, outputs)\n        logger.train_figures(steps, figures)\n        logger.train_audios(steps, audios, self.ap.sample_rate)\n\n    
@torch.no_grad()\n    def eval_step(self, batch: Dict, criterion: nn.Module, optimizer_idx: int) -> Tuple[Dict, Dict]:\n        \"\"\"Call `train_step()` with `no_grad()`\"\"\"\n        self.train_disc = True  # Avoid a bug in the Training with the missing discriminator loss\n        return self.train_step(batch, criterion, optimizer_idx)\n\n    def eval_log(\n        self, batch: Dict, outputs: Dict, logger: \"Logger\", assets: Dict, steps: int  # pylint: disable=unused-argument\n    ) -> Tuple[Dict, np.ndarray]:\n        \"\"\"Call `_log()` for evaluation.\"\"\"\n        figures, audios = self._log(\"eval\", self.ap, batch, outputs)\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    def load_checkpoint(\n        self,\n        config: Coqpit,\n        checkpoint_path: str,\n        eval: bool = False,  # pylint: disable=unused-argument, redefined-builtin\n        cache: bool = False,\n    ) -> None:\n        \"\"\"Load a GAN checkpoint and initialize model parameters.\n\n        Args:\n            config (Coqpit): Model config.\n            checkpoint_path (str): Checkpoint file path.\n            eval (bool, optional): If true, load the model for inference. 
Defaults to False.\n            cache (bool, optional): Passed on to `load_fsspec`. Defaults to False.\n        \"\"\"\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        # band-aid for GAN models older than v0.0.15\n        if \"model_disc\" in state:\n            self.model_g.load_checkpoint(config, checkpoint_path, eval)\n        else:\n            self.load_state_dict(state[\"model\"])\n            if eval:\n                self.model_d = None\n                if hasattr(self.model_g, \"remove_weight_norm\"):\n                    self.model_g.remove_weight_norm()\n\n    def on_train_step_start(self, trainer) -> None:\n        \"\"\"Enable the discriminator training based on `steps_to_start_discriminator`\n\n        Args:\n            trainer (Trainer): Trainer object.\n        \"\"\"\n        self.train_disc = trainer.total_steps_done >= self.config.steps_to_start_discriminator\n\n    def get_optimizer(self) -> List:\n        \"\"\"Initiate and return the GAN optimizers based on the config parameters.\n\n        It returns 2 optimizers in a list. 
The first one is for the discriminator and the second one is for the generator.\n\n        Returns:\n            List: optimizers.\n        \"\"\"\n        optimizer1 = get_optimizer(\n            self.config.optimizer, self.config.optimizer_params, self.config.lr_gen, self.model_g\n        )\n        optimizer2 = get_optimizer(\n            self.config.optimizer, self.config.optimizer_params, self.config.lr_disc, self.model_d\n        )\n        return [optimizer2, optimizer1]\n\n    def get_lr(self) -> List:\n        \"\"\"Return the initial learning rates for each optimizer.\n\n        Returns:\n            List: learning rates for each optimizer.\n        \"\"\"\n        return [self.config.lr_disc, self.config.lr_gen]\n\n    def get_scheduler(self, optimizer) -> List:\n        \"\"\"Set the schedulers for each optimizer.\n\n        Args:\n            optimizer (List[`torch.optim.Optimizer`]): List of optimizers.\n\n        Returns:\n            List: Schedulers, one for each optimizer.\n        \"\"\"\n        # `optimizer` is ordered [discriminator, generator], as returned by `get_optimizer()`\n        scheduler1 = get_scheduler(self.config.lr_scheduler_gen, self.config.lr_scheduler_gen_params, optimizer[1])\n        scheduler2 = get_scheduler(self.config.lr_scheduler_disc, self.config.lr_scheduler_disc_params, optimizer[0])\n        return [scheduler2, scheduler1]\n\n    @staticmethod\n    def format_batch(batch: List) -> Dict:\n        \"\"\"Format the batch for training.\n\n        Args:\n            batch (List): Batch out of the dataloader.\n\n        Returns:\n            Dict: formatted model inputs.\n        \"\"\"\n        if isinstance(batch[0], list):\n            x_G, y_G = batch[0]\n            x_D, y_D = batch[1]\n            return {\"input\": x_G, \"waveform\": y_G, \"input_disc\": x_D, \"waveform_disc\": y_D}\n        x, y = batch\n        return {\"input\": x, \"waveform\": y}\n\n    def get_data_loader(  # pylint: disable=no-self-use, unused-argument\n        self,\n        config: Coqpit,\n        assets: Dict,\n        is_eval: bool,\n    
    samples: List,\n        verbose: bool,\n        num_gpus: int,\n        rank: int = None,  # pylint: disable=unused-argument\n    ):\n        \"\"\"Initiate and return the GAN dataloader.\n\n        Args:\n            config (Coqpit): Model config.\n            assets (Dict): Assets passed by the trainer. Not used by this method.\n            is_eval (bool): Set the dataloader for evaluation if true.\n            samples (List): Data samples.\n            verbose (bool): Log information if true.\n            num_gpus (int): Number of GPUs in use.\n            rank (int): Rank of the current GPU. Defaults to None.\n\n        Returns:\n            DataLoader: Torch dataloader.\n        \"\"\"\n        dataset = GANDataset(\n            ap=self.ap,\n            items=samples,\n            seq_len=config.seq_len,\n            hop_len=self.ap.hop_length,\n            pad_short=config.pad_short,\n            conv_pad=config.conv_pad,\n            return_pairs=config.diff_samples_for_G_and_D if \"diff_samples_for_G_and_D\" in config else False,\n            is_training=not is_eval,\n            return_segments=not is_eval,\n            use_noise_augment=config.use_noise_augment,\n            use_cache=config.use_cache,\n            verbose=verbose,\n        )\n        dataset.shuffle_mapping()\n        sampler = DistributedSampler(dataset, shuffle=True) if num_gpus > 1 else None\n        loader = DataLoader(\n            dataset,\n            batch_size=1 if is_eval else config.batch_size,\n            shuffle=num_gpus == 0,\n            drop_last=False,\n            sampler=sampler,\n            num_workers=config.num_eval_loader_workers if is_eval else config.num_loader_workers,\n            pin_memory=False,\n        )\n        return loader\n\n    def get_criterion(self):\n        \"\"\"Return the criteria for the optimizers\"\"\"\n        return [DiscriminatorLoss(self.config), GeneratorLoss(self.config)]\n\n    @staticmethod\n    def init_from_config(config: Coqpit, verbose=True) -> \"GAN\":\n   
     ap = AudioProcessor.init_from_config(config, verbose=verbose)\n        return GAN(config, ap=ap)\n"
  },
  {
    "path": "TTS/vocoder/models/hifigan_discriminator.py",
    "content": "# adapted from https://github.com/jik876/hifi-gan/blob/master/models.py\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nLRELU_SLOPE = 0.1\n\n\nclass DiscriminatorP(torch.nn.Module):\n    \"\"\"HiFiGAN Periodic Discriminator\n\n    Takes every Pth value from the input waveform and applies a stack of convolutions.\n\n    Note:\n        if `period` is 2\n        `waveform = [1, 2, 3, 4, 5, 6 ...] --> [1, 3, 5 ... ] --> convs -> score, feat`\n\n    Args:\n        x (Tensor): input waveform.\n\n    Returns:\n        [Tensor]: discriminator scores per sample in the batch.\n        [List[Tensor]]: list of features from each convolutional layer.\n\n    Shapes:\n        x: [B, 1, T]\n    \"\"\"\n\n    def __init__(self, period, kernel_size=5, stride=3, use_spectral_norm=False):\n        super().__init__()\n        self.period = period\n        get_padding = lambda k, d: int((k * d - d) / 2)\n        norm_f = nn.utils.spectral_norm if use_spectral_norm else nn.utils.weight_norm\n        self.convs = nn.ModuleList(\n            [\n                norm_f(nn.Conv2d(1, 32, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(nn.Conv2d(32, 128, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(nn.Conv2d(128, 512, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(nn.Conv2d(512, 1024, (kernel_size, 1), (stride, 1), padding=(get_padding(kernel_size, 1), 0))),\n                norm_f(nn.Conv2d(1024, 1024, (kernel_size, 1), 1, padding=(2, 0))),\n            ]\n        )\n        self.conv_post = norm_f(nn.Conv2d(1024, 1, (3, 1), 1, padding=(1, 0)))\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input waveform.\n\n        Returns:\n            [Tensor]: discriminator scores per sample in the batch.\n            [List[Tensor]]: list of features from each 
convolutional layer.\n\n        Shapes:\n            x: [B, 1, T]\n        \"\"\"\n        feat = []\n\n        # 1d to 2d\n        b, c, t = x.shape\n        if t % self.period != 0:  # pad first\n            n_pad = self.period - (t % self.period)\n            x = F.pad(x, (0, n_pad), \"reflect\")\n            t = t + n_pad\n        x = x.view(b, c, t // self.period, self.period)\n\n        for l in self.convs:\n            x = l(x)\n            x = F.leaky_relu(x, LRELU_SLOPE)\n            feat.append(x)\n        x = self.conv_post(x)\n        feat.append(x)\n        x = torch.flatten(x, 1, -1)\n\n        return x, feat\n\n\nclass MultiPeriodDiscriminator(torch.nn.Module):\n    \"\"\"HiFiGAN Multi-Period Discriminator (MPD)\n    Wrapper for the `PeriodDiscriminator` to apply it in different periods.\n    Periods are suggested to be prime numbers to reduce the overlap between each discriminator.\n    \"\"\"\n\n    def __init__(self, use_spectral_norm=False):\n        super().__init__()\n        self.discriminators = nn.ModuleList(\n            [\n                DiscriminatorP(2, use_spectral_norm=use_spectral_norm),\n                DiscriminatorP(3, use_spectral_norm=use_spectral_norm),\n                DiscriminatorP(5, use_spectral_norm=use_spectral_norm),\n                DiscriminatorP(7, use_spectral_norm=use_spectral_norm),\n                DiscriminatorP(11, use_spectral_norm=use_spectral_norm),\n            ]\n        )\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input waveform.\n\n        Returns:\n        [List[Tensor]]: list of scores from each discriminator.\n            [List[List[Tensor]]]: list of list of features from each discriminator's each convolutional layer.\n\n        Shapes:\n            x: [B, 1, T]\n        \"\"\"\n        scores = []\n        feats = []\n        for _, d in enumerate(self.discriminators):\n            score, feat = d(x)\n            scores.append(score)\n            
feats.append(feat)\n        return scores, feats\n\n\nclass DiscriminatorS(torch.nn.Module):\n    \"\"\"HiFiGAN Scale Discriminator.\n    It is similar to `MelganDiscriminator` but with a specific architecture explained in the paper.\n\n    Args:\n        use_spectral_norm (bool): if `True` switch to spectral norm instead of weight norm.\n\n    \"\"\"\n\n    def __init__(self, use_spectral_norm=False):\n        super().__init__()\n        norm_f = nn.utils.spectral_norm if use_spectral_norm else nn.utils.weight_norm\n        self.convs = nn.ModuleList(\n            [\n                norm_f(nn.Conv1d(1, 128, 15, 1, padding=7)),\n                norm_f(nn.Conv1d(128, 128, 41, 2, groups=4, padding=20)),\n                norm_f(nn.Conv1d(128, 256, 41, 2, groups=16, padding=20)),\n                norm_f(nn.Conv1d(256, 512, 41, 4, groups=16, padding=20)),\n                norm_f(nn.Conv1d(512, 1024, 41, 4, groups=16, padding=20)),\n                norm_f(nn.Conv1d(1024, 1024, 41, 1, groups=16, padding=20)),\n                norm_f(nn.Conv1d(1024, 1024, 5, 1, padding=2)),\n            ]\n        )\n        self.conv_post = norm_f(nn.Conv1d(1024, 1, 3, 1, padding=1))\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input waveform.\n\n        Returns:\n            Tensor: discriminator scores.\n            List[Tensor]: list of features from the convolutional layers.\n        \"\"\"\n        feat = []\n        for l in self.convs:\n            x = l(x)\n            x = F.leaky_relu(x, LRELU_SLOPE)\n            feat.append(x)\n        x = self.conv_post(x)\n        feat.append(x)\n        x = torch.flatten(x, 1, -1)\n        return x, feat\n\n\nclass MultiScaleDiscriminator(torch.nn.Module):\n    \"\"\"HiFiGAN Multi-Scale Discriminator.\n    It is similar to `MultiScaleMelganDiscriminator` but specially tailored for HiFiGAN as in the paper.\n    \"\"\"\n\n    def __init__(self):\n        super().__init__()\n        self.discriminators = 
nn.ModuleList(\n            [\n                DiscriminatorS(use_spectral_norm=True),\n                DiscriminatorS(),\n                DiscriminatorS(),\n            ]\n        )\n        self.meanpools = nn.ModuleList([nn.AvgPool1d(4, 2, padding=2), nn.AvgPool1d(4, 2, padding=2)])\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input waveform.\n\n        Returns:\n            List[Tensor]: discriminator scores.\n            List[List[Tensor]]: list of list of features from each layers of each discriminator.\n        \"\"\"\n        scores = []\n        feats = []\n        for i, d in enumerate(self.discriminators):\n            if i != 0:\n                x = self.meanpools[i - 1](x)\n            score, feat = d(x)\n            scores.append(score)\n            feats.append(feat)\n        return scores, feats\n\n\nclass HifiganDiscriminator(nn.Module):\n    \"\"\"HiFiGAN discriminator wrapping MPD and MSD.\"\"\"\n\n    def __init__(self):\n        super().__init__()\n        self.mpd = MultiPeriodDiscriminator()\n        self.msd = MultiScaleDiscriminator()\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input waveform.\n\n        Returns:\n            List[Tensor]: discriminator scores.\n            List[List[Tensor]]: list of list of features from each layers of each discriminator.\n        \"\"\"\n        scores, feats = self.mpd(x)\n        scores_, feats_ = self.msd(x)\n        return scores + scores_, feats + feats_\n"
  },
  {
    "path": "TTS/vocoder/models/hifigan_generator.py",
    "content": "# adopted from https://github.com/jik876/hifi-gan/blob/master/models.py\nimport torch\nfrom torch import nn\nfrom torch.nn import Conv1d, ConvTranspose1d\nfrom torch.nn import functional as F\nfrom torch.nn.utils import remove_weight_norm, weight_norm\n\nfrom TTS.utils.io import load_fsspec\n\nLRELU_SLOPE = 0.1\n\n\ndef get_padding(k, d):\n    return int((k * d - d) / 2)\n\n\nclass ResBlock1(torch.nn.Module):\n    \"\"\"Residual Block Type 1. It has 3 convolutional layers in each convolutional block.\n\n    Network::\n\n        x -> lrelu -> conv1_1 -> conv1_2 -> conv1_3 -> z -> lrelu -> conv2_1 -> conv2_2 -> conv2_3 -> o -> + -> o\n        |--------------------------------------------------------------------------------------------------|\n\n\n    Args:\n        channels (int): number of hidden channels for the convolutional layers.\n        kernel_size (int): size of the convolution filter in each layer.\n        dilations (list): list of dilation value for each conv layer in a block.\n    \"\"\"\n\n    def __init__(self, channels, kernel_size=3, dilation=(1, 3, 5)):\n        super().__init__()\n        self.convs1 = nn.ModuleList(\n            [\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[0],\n                        padding=get_padding(kernel_size, dilation[0]),\n                    )\n                ),\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[1],\n                        padding=get_padding(kernel_size, dilation[1]),\n                    )\n                ),\n                weight_norm(\n                    Conv1d(\n                        channels,\n        
                channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[2],\n                        padding=get_padding(kernel_size, dilation[2]),\n                    )\n                ),\n            ]\n        )\n\n        self.convs2 = nn.ModuleList(\n            [\n                weight_norm(\n                    Conv1d(channels, channels, kernel_size, 1, dilation=1, padding=get_padding(kernel_size, 1))\n                ),\n                weight_norm(\n                    Conv1d(channels, channels, kernel_size, 1, dilation=1, padding=get_padding(kernel_size, 1))\n                ),\n                weight_norm(\n                    Conv1d(channels, channels, kernel_size, 1, dilation=1, padding=get_padding(kernel_size, 1))\n                ),\n            ]\n        )\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input tensor.\n        Returns:\n            Tensor: output tensor.\n        Shapes:\n            x: [B, C, T]\n        \"\"\"\n        for c1, c2 in zip(self.convs1, self.convs2):\n            xt = F.leaky_relu(x, LRELU_SLOPE)\n            xt = c1(xt)\n            xt = F.leaky_relu(xt, LRELU_SLOPE)\n            xt = c2(xt)\n            x = xt + x\n        return x\n\n    def remove_weight_norm(self):\n        for l in self.convs1:\n            remove_weight_norm(l)\n        for l in self.convs2:\n            remove_weight_norm(l)\n\n\nclass ResBlock2(torch.nn.Module):\n    \"\"\"Residual Block Type 2. 
It has 1 convolutional layers in each convolutional block.\n\n    Network::\n\n        x -> lrelu -> conv1-> -> z -> lrelu -> conv2-> o -> + -> o\n        |---------------------------------------------------|\n\n\n    Args:\n        channels (int): number of hidden channels for the convolutional layers.\n        kernel_size (int): size of the convolution filter in each layer.\n        dilations (list): list of dilation value for each conv layer in a block.\n    \"\"\"\n\n    def __init__(self, channels, kernel_size=3, dilation=(1, 3)):\n        super().__init__()\n        self.convs = nn.ModuleList(\n            [\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[0],\n                        padding=get_padding(kernel_size, dilation[0]),\n                    )\n                ),\n                weight_norm(\n                    Conv1d(\n                        channels,\n                        channels,\n                        kernel_size,\n                        1,\n                        dilation=dilation[1],\n                        padding=get_padding(kernel_size, dilation[1]),\n                    )\n                ),\n            ]\n        )\n\n    def forward(self, x):\n        for c in self.convs:\n            xt = F.leaky_relu(x, LRELU_SLOPE)\n            xt = c(xt)\n            x = xt + x\n        return x\n\n    def remove_weight_norm(self):\n        for l in self.convs:\n            remove_weight_norm(l)\n\n\nclass HifiganGenerator(torch.nn.Module):\n    def __init__(\n        self,\n        in_channels,\n        out_channels,\n        resblock_type,\n        resblock_dilation_sizes,\n        resblock_kernel_sizes,\n        upsample_kernel_sizes,\n        upsample_initial_channel,\n        upsample_factors,\n        inference_padding=5,\n        
cond_channels=0,\n        conv_pre_weight_norm=True,\n        conv_post_weight_norm=True,\n        conv_post_bias=True,\n    ):\n        r\"\"\"HiFiGAN Generator with Multi-Receptive Field Fusion (MRF)\n\n        Network:\n            x -> lrelu -> upsampling_layer -> resblock1_k1x1 -> z1 -> + -> z_sum / #resblocks -> lrelu -> conv_post_7x1 -> tanh -> o\n                                                 ..          -> zI ---|\n                                              resblockN_kNx1 -> zN ---'\n\n        Args:\n            in_channels (int): number of input tensor channels.\n            out_channels (int): number of output tensor channels.\n            resblock_type (str): type of the `ResBlock`. '1' or '2'.\n            resblock_dilation_sizes (List[List[int]]): list of dilation values in each layer of a `ResBlock`.\n            resblock_kernel_sizes (List[int]): list of kernel sizes for each `ResBlock`.\n            upsample_kernel_sizes (List[int]): list of kernel sizes for each transposed convolution.\n            upsample_initial_channel (int): number of channels for the first upsampling layer. This is divided by 2\n                for each consecutive upsampling layer.\n            upsample_factors (List[int]): upsampling factors (stride) for each upsampling layer.\n            inference_padding (int): constant padding applied to the input at inference time. 
Defaults to 5.\n        \"\"\"\n        super().__init__()\n        self.inference_padding = inference_padding\n        self.num_kernels = len(resblock_kernel_sizes)\n        self.num_upsamples = len(upsample_factors)\n        # initial upsampling layers\n        self.conv_pre = weight_norm(Conv1d(in_channels, upsample_initial_channel, 7, 1, padding=3))\n        resblock = ResBlock1 if resblock_type == \"1\" else ResBlock2\n        # upsampling layers\n        self.ups = nn.ModuleList()\n        for i, (u, k) in enumerate(zip(upsample_factors, upsample_kernel_sizes)):\n            self.ups.append(\n                weight_norm(\n                    ConvTranspose1d(\n                        upsample_initial_channel // (2**i),\n                        upsample_initial_channel // (2 ** (i + 1)),\n                        k,\n                        u,\n                        padding=(k - u) // 2,\n                    )\n                )\n            )\n        # MRF blocks\n        self.resblocks = nn.ModuleList()\n        for i in range(len(self.ups)):\n            ch = upsample_initial_channel // (2 ** (i + 1))\n            for _, (k, d) in enumerate(zip(resblock_kernel_sizes, resblock_dilation_sizes)):\n                self.resblocks.append(resblock(ch, k, d))\n        # post convolution layer\n        self.conv_post = weight_norm(Conv1d(ch, out_channels, 7, 1, padding=3, bias=conv_post_bias))\n        if cond_channels > 0:\n            self.cond_layer = nn.Conv1d(cond_channels, upsample_initial_channel, 1)\n\n        if not conv_pre_weight_norm:\n            remove_weight_norm(self.conv_pre)\n\n        if not conv_post_weight_norm:\n            remove_weight_norm(self.conv_post)\n\n    def forward(self, x, g=None):\n        \"\"\"\n        Args:\n            x (Tensor): feature input tensor.\n            g (Tensor): global conditioning input tensor.\n\n        Returns:\n            Tensor: output waveform.\n\n        Shapes:\n            x: [B, C, T]\n            
Tensor: [B, 1, T]\n        \"\"\"\n        o = self.conv_pre(x)\n        if hasattr(self, \"cond_layer\"):\n            o = o + self.cond_layer(g)\n        for i in range(self.num_upsamples):\n            o = F.leaky_relu(o, LRELU_SLOPE)\n            o = self.ups[i](o)\n            z_sum = None\n            for j in range(self.num_kernels):\n                if z_sum is None:\n                    z_sum = self.resblocks[i * self.num_kernels + j](o)\n                else:\n                    z_sum += self.resblocks[i * self.num_kernels + j](o)\n            o = z_sum / self.num_kernels\n        o = F.leaky_relu(o)\n        o = self.conv_post(o)\n        o = torch.tanh(o)\n        return o\n\n    @torch.no_grad()\n    def inference(self, c):\n        \"\"\"\n        Args:\n            c (Tensor): conditioning input tensor.\n\n        Returns:\n            Tensor: output waveform.\n\n        Shapes:\n            c: [B, C, T]\n            Tensor: [B, 1, T]\n        \"\"\"\n        c = c.to(self.conv_pre.weight.device)\n        c = torch.nn.functional.pad(c, (self.inference_padding, self.inference_padding), \"replicate\")\n        return self.forward(c)\n\n    def remove_weight_norm(self):\n        print(\"Removing weight norm...\")\n        for l in self.ups:\n            remove_weight_norm(l)\n        for l in self.resblocks:\n            l.remove_weight_norm()\n        remove_weight_norm(self.conv_pre)\n        remove_weight_norm(self.conv_post)\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n            self.remove_weight_norm()\n"
  },
  {
    "path": "TTS/vocoder/models/melgan_discriminator.py",
    "content": "import numpy as np\nfrom torch import nn\nfrom torch.nn.utils import weight_norm\n\n\nclass MelganDiscriminator(nn.Module):\n    def __init__(\n        self,\n        in_channels=1,\n        out_channels=1,\n        kernel_sizes=(5, 3),\n        base_channels=16,\n        max_channels=1024,\n        downsample_factors=(4, 4, 4, 4),\n        groups_denominator=4,\n    ):\n        super().__init__()\n        self.layers = nn.ModuleList()\n\n        layer_kernel_size = np.prod(kernel_sizes)\n        layer_padding = (layer_kernel_size - 1) // 2\n\n        # initial layer\n        self.layers += [\n            nn.Sequential(\n                nn.ReflectionPad1d(layer_padding),\n                weight_norm(nn.Conv1d(in_channels, base_channels, layer_kernel_size, stride=1)),\n                nn.LeakyReLU(0.2, inplace=True),\n            )\n        ]\n\n        # downsampling layers\n        layer_in_channels = base_channels\n        for downsample_factor in downsample_factors:\n            layer_out_channels = min(layer_in_channels * downsample_factor, max_channels)\n            layer_kernel_size = downsample_factor * 10 + 1\n            layer_padding = (layer_kernel_size - 1) // 2\n            layer_groups = layer_in_channels // groups_denominator\n            self.layers += [\n                nn.Sequential(\n                    weight_norm(\n                        nn.Conv1d(\n                            layer_in_channels,\n                            layer_out_channels,\n                            kernel_size=layer_kernel_size,\n                            stride=downsample_factor,\n                            padding=layer_padding,\n                            groups=layer_groups,\n                        )\n                    ),\n                    nn.LeakyReLU(0.2, inplace=True),\n                )\n            ]\n            layer_in_channels = layer_out_channels\n\n        # last 2 layers\n        layer_padding1 = (kernel_sizes[0] - 1) // 2\n     
   layer_padding2 = (kernel_sizes[1] - 1) // 2\n        self.layers += [\n            nn.Sequential(\n                weight_norm(\n                    nn.Conv1d(\n                        layer_out_channels,\n                        layer_out_channels,\n                        kernel_size=kernel_sizes[0],\n                        stride=1,\n                        padding=layer_padding1,\n                    )\n                ),\n                nn.LeakyReLU(0.2, inplace=True),\n            ),\n            weight_norm(\n                nn.Conv1d(\n                    layer_out_channels, out_channels, kernel_size=kernel_sizes[1], stride=1, padding=layer_padding2\n                )\n            ),\n        ]\n\n    def forward(self, x):\n        feats = []\n        for layer in self.layers:\n            x = layer(x)\n            feats.append(x)\n        return x, feats\n"
  },
  {
    "path": "TTS/vocoder/models/melgan_generator.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.nn.utils import weight_norm\n\nfrom TTS.utils.io import load_fsspec\nfrom TTS.vocoder.layers.melgan import ResidualStack\n\n\nclass MelganGenerator(nn.Module):\n    def __init__(\n        self,\n        in_channels=80,\n        out_channels=1,\n        proj_kernel=7,\n        base_channels=512,\n        upsample_factors=(8, 8, 2, 2),\n        res_kernel=3,\n        num_res_blocks=3,\n    ):\n        super().__init__()\n\n        # assert model parameters\n        assert (proj_kernel - 1) % 2 == 0, \" [!] proj_kernel should be an odd number.\"\n\n        # setup additional model parameters\n        base_padding = (proj_kernel - 1) // 2\n        act_slope = 0.2\n        self.inference_padding = 2\n\n        # initial layer\n        layers = []\n        layers += [\n            nn.ReflectionPad1d(base_padding),\n            weight_norm(nn.Conv1d(in_channels, base_channels, kernel_size=proj_kernel, stride=1, bias=True)),\n        ]\n\n        # upsampling layers and residual stacks\n        for idx, upsample_factor in enumerate(upsample_factors):\n            layer_in_channels = base_channels // (2**idx)\n            layer_out_channels = base_channels // (2 ** (idx + 1))\n            layer_filter_size = upsample_factor * 2\n            layer_stride = upsample_factor\n            layer_output_padding = upsample_factor % 2\n            layer_padding = upsample_factor // 2 + layer_output_padding\n            layers += [\n                nn.LeakyReLU(act_slope),\n                weight_norm(\n                    nn.ConvTranspose1d(\n                        layer_in_channels,\n                        layer_out_channels,\n                        layer_filter_size,\n                        stride=layer_stride,\n                        padding=layer_padding,\n                        output_padding=layer_output_padding,\n                        bias=True,\n                    )\n                ),\n                
ResidualStack(channels=layer_out_channels, num_res_blocks=num_res_blocks, kernel_size=res_kernel),\n            ]\n\n        layers += [nn.LeakyReLU(act_slope)]\n\n        # final layer\n        layers += [\n            nn.ReflectionPad1d(base_padding),\n            weight_norm(nn.Conv1d(layer_out_channels, out_channels, proj_kernel, stride=1, bias=True)),\n            nn.Tanh(),\n        ]\n        self.layers = nn.Sequential(*layers)\n\n    def forward(self, c):\n        return self.layers(c)\n\n    def inference(self, c):\n        c = c.to(self.layers[1].weight.device)\n        c = torch.nn.functional.pad(c, (self.inference_padding, self.inference_padding), \"replicate\")\n        return self.layers(c)\n\n    def remove_weight_norm(self):\n        for _, layer in enumerate(self.layers):\n            if len(layer.state_dict()) != 0:\n                try:\n                    nn.utils.remove_weight_norm(layer)\n                except ValueError:\n                    layer.remove_weight_norm()\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n            self.remove_weight_norm()\n"
  },
  {
    "path": "TTS/vocoder/models/melgan_multiscale_discriminator.py",
    "content": "from torch import nn\n\nfrom TTS.vocoder.models.melgan_discriminator import MelganDiscriminator\n\n\nclass MelganMultiscaleDiscriminator(nn.Module):\n    def __init__(\n        self,\n        in_channels=1,\n        out_channels=1,\n        num_scales=3,\n        kernel_sizes=(5, 3),\n        base_channels=16,\n        max_channels=1024,\n        downsample_factors=(4, 4, 4),\n        pooling_kernel_size=4,\n        pooling_stride=2,\n        pooling_padding=2,\n        groups_denominator=4,\n    ):\n        super().__init__()\n\n        self.discriminators = nn.ModuleList(\n            [\n                MelganDiscriminator(\n                    in_channels=in_channels,\n                    out_channels=out_channels,\n                    kernel_sizes=kernel_sizes,\n                    base_channels=base_channels,\n                    max_channels=max_channels,\n                    downsample_factors=downsample_factors,\n                    groups_denominator=groups_denominator,\n                )\n                for _ in range(num_scales)\n            ]\n        )\n\n        self.pooling = nn.AvgPool1d(\n            kernel_size=pooling_kernel_size, stride=pooling_stride, padding=pooling_padding, count_include_pad=False\n        )\n\n    def forward(self, x):\n        scores = []\n        feats = []\n        for disc in self.discriminators:\n            score, feat = disc(x)\n            scores.append(score)\n            feats.append(feat)\n            x = self.pooling(x)\n        return scores, feats\n"
  },
  {
    "path": "TTS/vocoder/models/multiband_melgan_generator.py",
    "content": "import torch\n\nfrom TTS.vocoder.layers.pqmf import PQMF\nfrom TTS.vocoder.models.melgan_generator import MelganGenerator\n\n\nclass MultibandMelganGenerator(MelganGenerator):\n    def __init__(\n        self,\n        in_channels=80,\n        out_channels=4,\n        proj_kernel=7,\n        base_channels=384,\n        upsample_factors=(2, 8, 2, 2),\n        res_kernel=3,\n        num_res_blocks=3,\n    ):\n        super().__init__(\n            in_channels=in_channels,\n            out_channels=out_channels,\n            proj_kernel=proj_kernel,\n            base_channels=base_channels,\n            upsample_factors=upsample_factors,\n            res_kernel=res_kernel,\n            num_res_blocks=num_res_blocks,\n        )\n        self.pqmf_layer = PQMF(N=4, taps=62, cutoff=0.15, beta=9.0)\n\n    def pqmf_analysis(self, x):\n        return self.pqmf_layer.analysis(x)\n\n    def pqmf_synthesis(self, x):\n        return self.pqmf_layer.synthesis(x)\n\n    @torch.no_grad()\n    def inference(self, cond_features):\n        cond_features = cond_features.to(self.layers[1].weight.device)\n        cond_features = torch.nn.functional.pad(\n            cond_features, (self.inference_padding, self.inference_padding), \"replicate\"\n        )\n        return self.pqmf_synthesis(self.layers(cond_features))\n"
  },
  {
    "path": "TTS/vocoder/models/parallel_wavegan_discriminator.py",
    "content": "import math\n\nimport torch\nfrom torch import nn\n\nfrom TTS.vocoder.layers.parallel_wavegan import ResidualBlock\n\n\nclass ParallelWaveganDiscriminator(nn.Module):\n    \"\"\"PWGAN discriminator as in https://arxiv.org/abs/1910.11480.\n    It classifies each audio window real/fake and returns a sequence\n    of predictions.\n        It is a stack of convolutional blocks with dilation.\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        in_channels=1,\n        out_channels=1,\n        kernel_size=3,\n        num_layers=10,\n        conv_channels=64,\n        dilation_factor=1,\n        nonlinear_activation=\"LeakyReLU\",\n        nonlinear_activation_params={\"negative_slope\": 0.2},\n        bias=True,\n    ):\n        super().__init__()\n        assert (kernel_size - 1) % 2 == 0, \" [!] does not support even number kernel size.\"\n        assert dilation_factor > 0, \" [!] dilation factor must be > 0.\"\n        self.conv_layers = nn.ModuleList()\n        conv_in_channels = in_channels\n        for i in range(num_layers - 1):\n            if i == 0:\n                dilation = 1\n            else:\n                dilation = i if dilation_factor == 1 else dilation_factor**i\n                conv_in_channels = conv_channels\n            padding = (kernel_size - 1) // 2 * dilation\n            conv_layer = [\n                nn.Conv1d(\n                    conv_in_channels,\n                    conv_channels,\n                    kernel_size=kernel_size,\n                    padding=padding,\n                    dilation=dilation,\n                    bias=bias,\n                ),\n                getattr(nn, nonlinear_activation)(inplace=True, **nonlinear_activation_params),\n            ]\n            self.conv_layers += conv_layer\n        padding = (kernel_size - 1) // 2\n        last_conv_layer = nn.Conv1d(conv_in_channels, out_channels, kernel_size=kernel_size, padding=padding, 
bias=bias)\n        self.conv_layers += [last_conv_layer]\n        self.apply_weight_norm()\n\n    def forward(self, x):\n        \"\"\"\n            x : (B, 1, T).\n        Returns:\n            Tensor: (B, 1, T)\n        \"\"\"\n        for f in self.conv_layers:\n            x = f(x)\n        return x\n\n    def apply_weight_norm(self):\n        def _apply_weight_norm(m):\n            if isinstance(m, (torch.nn.Conv1d, torch.nn.Conv2d)):\n                torch.nn.utils.weight_norm(m)\n\n        self.apply(_apply_weight_norm)\n\n    def remove_weight_norm(self):\n        def _remove_weight_norm(m):\n            try:\n                # print(f\"Weight norm is removed from {m}.\")\n                nn.utils.remove_weight_norm(m)\n            except ValueError:  # this module didn't have weight norm\n                return\n\n        self.apply(_remove_weight_norm)\n\n\nclass ResidualParallelWaveganDiscriminator(nn.Module):\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        in_channels=1,\n        out_channels=1,\n        kernel_size=3,\n        num_layers=30,\n        stacks=3,\n        res_channels=64,\n        gate_channels=128,\n        skip_channels=64,\n        dropout=0.0,\n        bias=True,\n        nonlinear_activation=\"LeakyReLU\",\n        nonlinear_activation_params={\"negative_slope\": 0.2},\n    ):\n        super().__init__()\n        assert (kernel_size - 1) % 2 == 0, \"Not support even number kernel size.\"\n\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.num_layers = num_layers\n        self.stacks = stacks\n        self.kernel_size = kernel_size\n        self.res_factor = math.sqrt(1.0 / num_layers)\n\n        # check the number of num_layers and stacks\n        assert num_layers % stacks == 0\n        layers_per_stack = num_layers // stacks\n\n        # define first convolution\n        self.first_conv = nn.Sequential(\n            nn.Conv1d(in_channels, 
res_channels, kernel_size=1, padding=0, dilation=1, bias=True),\n            getattr(nn, nonlinear_activation)(inplace=True, **nonlinear_activation_params),\n        )\n\n        # define residual blocks\n        self.conv_layers = nn.ModuleList()\n        for layer in range(num_layers):\n            dilation = 2 ** (layer % layers_per_stack)\n            conv = ResidualBlock(\n                kernel_size=kernel_size,\n                res_channels=res_channels,\n                gate_channels=gate_channels,\n                skip_channels=skip_channels,\n                aux_channels=-1,\n                dilation=dilation,\n                dropout=dropout,\n                bias=bias,\n                use_causal_conv=False,\n            )\n            self.conv_layers += [conv]\n\n        # define output layers\n        self.last_conv_layers = nn.ModuleList(\n            [\n                getattr(nn, nonlinear_activation)(inplace=True, **nonlinear_activation_params),\n                nn.Conv1d(skip_channels, skip_channels, kernel_size=1, padding=0, dilation=1, bias=True),\n                getattr(nn, nonlinear_activation)(inplace=True, **nonlinear_activation_params),\n                nn.Conv1d(skip_channels, out_channels, kernel_size=1, padding=0, dilation=1, bias=True),\n            ]\n        )\n\n        # apply weight norm\n        self.apply_weight_norm()\n\n    def forward(self, x):\n        \"\"\"\n        x: (B, 1, T).\n        \"\"\"\n        x = self.first_conv(x)\n\n        skips = 0\n        for f in self.conv_layers:\n            x, h = f(x, None)\n            skips += h\n        skips *= self.res_factor\n\n        # apply final layers\n        x = skips\n        for f in self.last_conv_layers:\n            x = f(x)\n        return x\n\n    def apply_weight_norm(self):\n        def _apply_weight_norm(m):\n            if isinstance(m, (torch.nn.Conv1d, torch.nn.Conv2d)):\n                torch.nn.utils.weight_norm(m)\n\n        
self.apply(_apply_weight_norm)\n\n    def remove_weight_norm(self):\n        def _remove_weight_norm(m):\n            try:\n                print(f\"Weight norm is removed from {m}.\")\n                nn.utils.remove_weight_norm(m)\n            except ValueError:  # this module didn't have weight norm\n                return\n\n        self.apply(_remove_weight_norm)\n"
  },
  {
    "path": "TTS/vocoder/models/parallel_wavegan_generator.py",
    "content": "import math\n\nimport numpy as np\nimport torch\n\nfrom TTS.utils.io import load_fsspec\nfrom TTS.vocoder.layers.parallel_wavegan import ResidualBlock\nfrom TTS.vocoder.layers.upsample import ConvUpsample\n\n\nclass ParallelWaveganGenerator(torch.nn.Module):\n    \"\"\"PWGAN generator as in https://arxiv.org/pdf/1910.11480.pdf.\n    It is similar to WaveNet with no causal convolution.\n        It is conditioned on an aux feature (spectrogram) to generate\n    an output waveform from an input noise.\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n    def __init__(\n        self,\n        in_channels=1,\n        out_channels=1,\n        kernel_size=3,\n        num_res_blocks=30,\n        stacks=3,\n        res_channels=64,\n        gate_channels=128,\n        skip_channels=64,\n        aux_channels=80,\n        dropout=0.0,\n        bias=True,\n        use_weight_norm=True,\n        upsample_factors=[4, 4, 4, 4],\n        inference_padding=2,\n    ):\n        super().__init__()\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.aux_channels = aux_channels\n        self.num_res_blocks = num_res_blocks\n        self.stacks = stacks\n        self.kernel_size = kernel_size\n        self.upsample_factors = upsample_factors\n        self.upsample_scale = np.prod(upsample_factors)\n        self.inference_padding = inference_padding\n        self.use_weight_norm = use_weight_norm\n\n        # check the number of layers and stacks\n        assert num_res_blocks % stacks == 0\n        layers_per_stack = num_res_blocks // stacks\n\n        # define first convolution\n        self.first_conv = torch.nn.Conv1d(in_channels, res_channels, kernel_size=1, bias=True)\n\n        # define conv + upsampling network\n        self.upsample_net = ConvUpsample(upsample_factors=upsample_factors)\n\n        # define residual blocks\n        self.conv_layers = torch.nn.ModuleList()\n        for layer in 
range(num_res_blocks):\n            dilation = 2 ** (layer % layers_per_stack)\n            conv = ResidualBlock(\n                kernel_size=kernel_size,\n                res_channels=res_channels,\n                gate_channels=gate_channels,\n                skip_channels=skip_channels,\n                aux_channels=aux_channels,\n                dilation=dilation,\n                dropout=dropout,\n                bias=bias,\n            )\n            self.conv_layers += [conv]\n\n        # define output layers\n        self.last_conv_layers = torch.nn.ModuleList(\n            [\n                torch.nn.ReLU(inplace=True),\n                torch.nn.Conv1d(skip_channels, skip_channels, kernel_size=1, bias=True),\n                torch.nn.ReLU(inplace=True),\n                torch.nn.Conv1d(skip_channels, out_channels, kernel_size=1, bias=True),\n            ]\n        )\n\n        # apply weight norm\n        if use_weight_norm:\n            self.apply_weight_norm()\n\n    def forward(self, c):\n        \"\"\"\n        c: (B, C ,T').\n        o: Output tensor (B, out_channels, T)\n        \"\"\"\n        # random noise\n        x = torch.randn([c.shape[0], 1, c.shape[2] * self.upsample_scale])\n        x = x.to(self.first_conv.bias.device)\n\n        # perform upsampling\n        if c is not None and self.upsample_net is not None:\n            c = self.upsample_net(c)\n            assert (\n                c.shape[-1] == x.shape[-1]\n            ), f\" [!] Upsampling scale does not match the expected output. 
{c.shape} vs {x.shape}\"\n\n        # encode to hidden representation\n        x = self.first_conv(x)\n        skips = 0\n        for f in self.conv_layers:\n            x, h = f(x, c)\n            skips += h\n        skips *= math.sqrt(1.0 / len(self.conv_layers))\n\n        # apply final layers\n        x = skips\n        for f in self.last_conv_layers:\n            x = f(x)\n\n        return x\n\n    @torch.no_grad()\n    def inference(self, c):\n        c = c.to(self.first_conv.weight.device)\n        c = torch.nn.functional.pad(c, (self.inference_padding, self.inference_padding), \"replicate\")\n        return self.forward(c)\n\n    def remove_weight_norm(self):\n        def _remove_weight_norm(m):\n            try:\n                # print(f\"Weight norm is removed from {m}.\")\n                torch.nn.utils.remove_weight_norm(m)\n            except ValueError:  # this module didn't have weight norm\n                return\n\n        self.apply(_remove_weight_norm)\n\n    def apply_weight_norm(self):\n        def _apply_weight_norm(m):\n            if isinstance(m, (torch.nn.Conv1d, torch.nn.Conv2d)):\n                torch.nn.utils.weight_norm(m)\n                # print(f\"Weight norm is applied to {m}.\")\n\n        self.apply(_apply_weight_norm)\n\n    @staticmethod\n    def _get_receptive_field_size(layers, stacks, kernel_size, dilation=lambda x: 2**x):\n        assert layers % stacks == 0\n        layers_per_cycle = layers // stacks\n        dilations = [dilation(i % layers_per_cycle) for i in range(layers)]\n        return (kernel_size - 1) * sum(dilations) + 1\n\n    @property\n    def receptive_field_size(self):\n        return self._get_receptive_field_size(self.num_res_blocks, self.stacks, self.kernel_size)\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n   
     self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n            if self.use_weight_norm:\n                self.remove_weight_norm()\n"
  },
  {
    "path": "TTS/vocoder/models/random_window_discriminator.py",
    "content": "import numpy as np\nfrom torch import nn\n\n\nclass GBlock(nn.Module):\n    def __init__(self, in_channels, cond_channels, downsample_factor):\n        super().__init__()\n\n        self.in_channels = in_channels\n        self.cond_channels = cond_channels\n        self.downsample_factor = downsample_factor\n\n        self.start = nn.Sequential(\n            nn.AvgPool1d(downsample_factor, stride=downsample_factor),\n            nn.ReLU(),\n            nn.Conv1d(in_channels, in_channels * 2, kernel_size=3, padding=1),\n        )\n        self.lc_conv1d = nn.Conv1d(cond_channels, in_channels * 2, kernel_size=1)\n        self.end = nn.Sequential(\n            nn.ReLU(), nn.Conv1d(in_channels * 2, in_channels * 2, kernel_size=3, dilation=2, padding=2)\n        )\n        self.residual = nn.Sequential(\n            nn.Conv1d(in_channels, in_channels * 2, kernel_size=1),\n            nn.AvgPool1d(downsample_factor, stride=downsample_factor),\n        )\n\n    def forward(self, inputs, conditions):\n        outputs = self.start(inputs) + self.lc_conv1d(conditions)\n        outputs = self.end(outputs)\n        residual_outputs = self.residual(inputs)\n        outputs = outputs + residual_outputs\n\n        return outputs\n\n\nclass DBlock(nn.Module):\n    def __init__(self, in_channels, out_channels, downsample_factor):\n        super().__init__()\n\n        self.in_channels = in_channels\n        self.downsample_factor = downsample_factor\n        self.out_channels = out_channels\n\n        self.donwsample_layer = nn.AvgPool1d(downsample_factor, stride=downsample_factor)\n        self.layers = nn.Sequential(\n            nn.ReLU(),\n            nn.Conv1d(in_channels, out_channels, kernel_size=3, padding=1),\n            nn.ReLU(),\n            nn.Conv1d(out_channels, out_channels, kernel_size=3, dilation=2, padding=2),\n        )\n        self.residual = nn.Sequential(\n            nn.Conv1d(in_channels, out_channels, kernel_size=1),\n        )\n\n    def 
forward(self, inputs):\n        if self.downsample_factor > 1:\n            outputs = self.layers(self.donwsample_layer(inputs)) + self.donwsample_layer(self.residual(inputs))\n        else:\n            outputs = self.layers(inputs) + self.residual(inputs)\n        return outputs\n\n\nclass ConditionalDiscriminator(nn.Module):\n    def __init__(self, in_channels, cond_channels, downsample_factors=(2, 2, 2), out_channels=(128, 256)):\n        super().__init__()\n\n        assert len(downsample_factors) == len(out_channels) + 1\n\n        self.in_channels = in_channels\n        self.cond_channels = cond_channels\n        self.downsample_factors = downsample_factors\n        self.out_channels = out_channels\n\n        self.pre_cond_layers = nn.ModuleList()\n        self.post_cond_layers = nn.ModuleList()\n\n        # layers before condition features\n        self.pre_cond_layers += [DBlock(in_channels, 64, 1)]\n        in_channels = 64\n        for i, channel in enumerate(out_channels):\n            self.pre_cond_layers.append(DBlock(in_channels, channel, downsample_factors[i]))\n            in_channels = channel\n\n        # condition block\n        self.cond_block = GBlock(in_channels, cond_channels, downsample_factors[-1])\n\n        # layers after condition block\n        self.post_cond_layers += [\n            DBlock(in_channels * 2, in_channels * 2, 1),\n            DBlock(in_channels * 2, in_channels * 2, 1),\n            nn.AdaptiveAvgPool1d(1),\n            nn.Conv1d(in_channels * 2, 1, kernel_size=1),\n        ]\n\n    def forward(self, inputs, conditions):\n        batch_size = inputs.size()[0]\n        outputs = inputs.view(batch_size, self.in_channels, -1)\n        for layer in self.pre_cond_layers:\n            outputs = layer(outputs)\n        outputs = self.cond_block(outputs, conditions)\n        for layer in self.post_cond_layers:\n            outputs = layer(outputs)\n\n        return outputs\n\n\nclass UnconditionalDiscriminator(nn.Module):\n    
def __init__(self, in_channels, base_channels=64, downsample_factors=(8, 4), out_channels=(128, 256)):\n        super().__init__()\n\n        self.downsample_factors = downsample_factors\n        self.in_channels = in_channels\n        self.downsample_factors = downsample_factors\n        self.out_channels = out_channels\n\n        self.layers = nn.ModuleList()\n        self.layers += [DBlock(self.in_channels, base_channels, 1)]\n        in_channels = base_channels\n        for i, factor in enumerate(downsample_factors):\n            self.layers.append(DBlock(in_channels, out_channels[i], factor))\n            in_channels *= 2\n        self.layers += [\n            DBlock(in_channels, in_channels, 1),\n            DBlock(in_channels, in_channels, 1),\n            nn.AdaptiveAvgPool1d(1),\n            nn.Conv1d(in_channels, 1, kernel_size=1),\n        ]\n\n    def forward(self, inputs):\n        batch_size = inputs.size()[0]\n        outputs = inputs.view(batch_size, self.in_channels, -1)\n        for layer in self.layers:\n            outputs = layer(outputs)\n        return outputs\n\n\nclass RandomWindowDiscriminator(nn.Module):\n    \"\"\"Random Window Discriminator as described in\n    http://arxiv.org/abs/1909.11646\"\"\"\n\n    def __init__(\n        self,\n        cond_channels,\n        hop_length,\n        uncond_disc_donwsample_factors=(8, 4),\n        cond_disc_downsample_factors=((8, 4, 2, 2, 2), (8, 4, 2, 2), (8, 4, 2), (8, 4), (4, 2, 2)),\n        cond_disc_out_channels=((128, 128, 256, 256), (128, 256, 256), (128, 256), (256,), (128, 256)),\n        window_sizes=(512, 1024, 2048, 4096, 8192),\n    ):\n        super().__init__()\n        self.cond_channels = cond_channels\n        self.window_sizes = window_sizes\n        self.hop_length = hop_length\n        self.base_window_size = self.hop_length * 2\n        self.ks = [ws // self.base_window_size for ws in window_sizes]\n\n        # check arguments\n        assert len(cond_disc_downsample_factors) 
== len(cond_disc_out_channels) == len(window_sizes)\n        for ws in window_sizes:\n            assert ws % hop_length == 0\n\n        for idx, cf in enumerate(cond_disc_downsample_factors):\n            assert np.prod(cf) == hop_length // self.ks[idx]\n\n        # define layers\n        self.unconditional_discriminators = nn.ModuleList([])\n        for k in self.ks:\n            layer = UnconditionalDiscriminator(\n                in_channels=k, base_channels=64, downsample_factors=uncond_disc_donwsample_factors\n            )\n            self.unconditional_discriminators.append(layer)\n\n        self.conditional_discriminators = nn.ModuleList([])\n        for idx, k in enumerate(self.ks):\n            layer = ConditionalDiscriminator(\n                in_channels=k,\n                cond_channels=cond_channels,\n                downsample_factors=cond_disc_downsample_factors[idx],\n                out_channels=cond_disc_out_channels[idx],\n            )\n            self.conditional_discriminators.append(layer)\n\n    def forward(self, x, c):\n        scores = []\n        feats = []\n        # unconditional pass\n        for window_size, layer in zip(self.window_sizes, self.unconditional_discriminators):\n            index = np.random.randint(x.shape[-1] - window_size)\n\n            score = layer(x[:, :, index : index + window_size])\n            scores.append(score)\n\n        # conditional pass\n        for window_size, layer in zip(self.window_sizes, self.conditional_discriminators):\n            frame_size = window_size // self.hop_length\n            lc_index = np.random.randint(c.shape[-1] - frame_size)\n            sample_index = lc_index * self.hop_length\n            x_sub = x[:, :, sample_index : (lc_index + frame_size) * self.hop_length]\n            c_sub = c[:, :, lc_index : lc_index + frame_size]\n\n            score = layer(x_sub, c_sub)\n            scores.append(score)\n        return scores, feats\n"
  },
  {
    "path": "TTS/vocoder/models/univnet_discriminator.py",
    "content": "import torch\nimport torch.nn.functional as F\nfrom torch import nn\nfrom torch.nn.utils import spectral_norm, weight_norm\n\nfrom TTS.utils.audio.torch_transforms import TorchSTFT\nfrom TTS.vocoder.models.hifigan_discriminator import MultiPeriodDiscriminator\n\nLRELU_SLOPE = 0.1\n\n\nclass SpecDiscriminator(nn.Module):\n    \"\"\"docstring for Discriminator.\"\"\"\n\n    def __init__(self, fft_size=1024, hop_length=120, win_length=600, use_spectral_norm=False):\n        super().__init__()\n        norm_f = weight_norm if use_spectral_norm is False else spectral_norm\n        self.fft_size = fft_size\n        self.hop_length = hop_length\n        self.win_length = win_length\n        self.stft = TorchSTFT(fft_size, hop_length, win_length)\n        self.discriminators = nn.ModuleList(\n            [\n                norm_f(nn.Conv2d(1, 32, kernel_size=(3, 9), padding=(1, 4))),\n                norm_f(nn.Conv2d(32, 32, kernel_size=(3, 9), stride=(1, 2), padding=(1, 4))),\n                norm_f(nn.Conv2d(32, 32, kernel_size=(3, 9), stride=(1, 2), padding=(1, 4))),\n                norm_f(nn.Conv2d(32, 32, kernel_size=(3, 9), stride=(1, 2), padding=(1, 4))),\n                norm_f(nn.Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))),\n            ]\n        )\n\n        self.out = norm_f(nn.Conv2d(32, 1, 3, 1, 1))\n\n    def forward(self, y):\n        fmap = []\n        with torch.no_grad():\n            y = y.squeeze(1)\n            y = self.stft(y)\n        y = y.unsqueeze(1)\n        for _, d in enumerate(self.discriminators):\n            y = d(y)\n            y = F.leaky_relu(y, LRELU_SLOPE)\n            fmap.append(y)\n\n        y = self.out(y)\n        fmap.append(y)\n\n        return torch.flatten(y, 1, -1), fmap\n\n\nclass MultiResSpecDiscriminator(torch.nn.Module):\n    def __init__(  # pylint: disable=dangerous-default-value\n        self, fft_sizes=[1024, 2048, 512], hop_sizes=[120, 240, 50], win_lengths=[600, 1200, 240], 
window=\"hann_window\"\n    ):\n        super().__init__()\n        self.discriminators = nn.ModuleList(\n            [\n                SpecDiscriminator(fft_sizes[0], hop_sizes[0], win_lengths[0], window),\n                SpecDiscriminator(fft_sizes[1], hop_sizes[1], win_lengths[1], window),\n                SpecDiscriminator(fft_sizes[2], hop_sizes[2], win_lengths[2], window),\n            ]\n        )\n\n    def forward(self, x):\n        scores = []\n        feats = []\n        for d in self.discriminators:\n            score, feat = d(x)\n            scores.append(score)\n            feats.append(feat)\n\n        return scores, feats\n\n\nclass UnivnetDiscriminator(nn.Module):\n    \"\"\"Univnet discriminator wrapping MPD and MSD.\"\"\"\n\n    def __init__(self):\n        super().__init__()\n        self.mpd = MultiPeriodDiscriminator()\n        self.msd = MultiResSpecDiscriminator()\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n            x (Tensor): input waveform.\n\n        Returns:\n            List[Tensor]: discriminator scores.\n            List[List[Tensor]]: list of list of features from each layers of each discriminator.\n        \"\"\"\n        scores, feats = self.mpd(x)\n        scores_, feats_ = self.msd(x)\n        return scores + scores_, feats + feats_\n"
  },
  {
    "path": "TTS/vocoder/models/univnet_generator.py",
    "content": "from typing import List\n\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\n\nfrom TTS.vocoder.layers.lvc_block import LVCBlock\n\nLRELU_SLOPE = 0.1\n\n\nclass UnivnetGenerator(torch.nn.Module):\n    def __init__(\n        self,\n        in_channels: int,\n        out_channels: int,\n        hidden_channels: int,\n        cond_channels: int,\n        upsample_factors: List[int],\n        lvc_layers_each_block: int,\n        lvc_kernel_size: int,\n        kpnet_hidden_channels: int,\n        kpnet_conv_size: int,\n        dropout: float,\n        use_weight_norm=True,\n    ):\n        \"\"\"Univnet Generator network.\n\n        Paper: https://arxiv.org/pdf/2106.07889.pdf\n\n        Args:\n            in_channels (int): Number of input tensor channels.\n            out_channels (int): Number of channels of the output tensor.\n            hidden_channels (int): Number of hidden network channels.\n            cond_channels (int): Number of channels of the conditioning tensors.\n            upsample_factors (List[int]): List of uplsample factors for the upsampling layers.\n            lvc_layers_each_block (int): Number of LVC layers in each block.\n            lvc_kernel_size (int): Kernel size of the LVC layers.\n            kpnet_hidden_channels (int): Number of hidden channels in the key-point network.\n            kpnet_conv_size (int): Number of convolution channels in the key-point network.\n            dropout (float): Dropout rate.\n            use_weight_norm (bool, optional): Enable/disable weight norm. 
Defaults to True.\n        \"\"\"\n\n        super().__init__()\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.cond_channels = cond_channels\n        self.upsample_scale = np.prod(upsample_factors)\n        self.lvc_block_nums = len(upsample_factors)\n\n        # define first convolution\n        self.first_conv = torch.nn.Conv1d(\n            in_channels, hidden_channels, kernel_size=7, padding=(7 - 1) // 2, dilation=1, bias=True\n        )\n\n        # define residual blocks\n        self.lvc_blocks = torch.nn.ModuleList()\n        cond_hop_length = 1\n        for n in range(self.lvc_block_nums):\n            cond_hop_length = cond_hop_length * upsample_factors[n]\n            lvcb = LVCBlock(\n                in_channels=hidden_channels,\n                cond_channels=cond_channels,\n                upsample_ratio=upsample_factors[n],\n                conv_layers=lvc_layers_each_block,\n                conv_kernel_size=lvc_kernel_size,\n                cond_hop_length=cond_hop_length,\n                kpnet_hidden_channels=kpnet_hidden_channels,\n                kpnet_conv_size=kpnet_conv_size,\n                kpnet_dropout=dropout,\n            )\n            self.lvc_blocks += [lvcb]\n\n        # define output layers\n        self.last_conv_layers = torch.nn.ModuleList(\n            [\n                torch.nn.Conv1d(\n                    hidden_channels, out_channels, kernel_size=7, padding=(7 - 1) // 2, dilation=1, bias=True\n                ),\n            ]\n        )\n\n        # apply weight norm\n        if use_weight_norm:\n            self.apply_weight_norm()\n\n    def forward(self, c):\n        \"\"\"Calculate forward propagation.\n        Args:\n            c (Tensor): Local conditioning auxiliary features (B, C ,T').\n        Returns:\n            Tensor: Output tensor (B, out_channels, T)\n        \"\"\"\n        # random noise\n        x = torch.randn([c.shape[0], self.in_channels, c.shape[2]])\n 
       x = x.to(self.first_conv.bias.device)\n        x = self.first_conv(x)\n\n        for n in range(self.lvc_block_nums):\n            x = self.lvc_blocks[n](x, c)\n\n        # apply final layers\n        for f in self.last_conv_layers:\n            x = F.leaky_relu(x, LRELU_SLOPE)\n            x = f(x)\n        x = torch.tanh(x)\n        return x\n\n    def remove_weight_norm(self):\n        \"\"\"Remove weight normalization module from all of the layers.\"\"\"\n\n        def _remove_weight_norm(m):\n            try:\n                # print(f\"Weight norm is removed from {m}.\")\n                torch.nn.utils.remove_weight_norm(m)\n            except ValueError:  # this module didn't have weight norm\n                return\n\n        self.apply(_remove_weight_norm)\n\n    def apply_weight_norm(self):\n        \"\"\"Apply weight normalization module from all of the layers.\"\"\"\n\n        def _apply_weight_norm(m):\n            if isinstance(m, (torch.nn.Conv1d, torch.nn.Conv2d)):\n                torch.nn.utils.weight_norm(m)\n                # print(f\"Weight norm is applied to {m}.\")\n\n        self.apply(_apply_weight_norm)\n\n    @staticmethod\n    def _get_receptive_field_size(layers, stacks, kernel_size, dilation=lambda x: 2**x):\n        assert layers % stacks == 0\n        layers_per_cycle = layers // stacks\n        dilations = [dilation(i % layers_per_cycle) for i in range(layers)]\n        return (kernel_size - 1) * sum(dilations) + 1\n\n    @property\n    def receptive_field_size(self):\n        \"\"\"Return receptive field size.\"\"\"\n        return self._get_receptive_field_size(self.layers, self.stacks, self.kernel_size)\n\n    @torch.no_grad()\n    def inference(self, c):\n        \"\"\"Perform inference.\n        Args:\n            c (Tensor): Local conditioning auxiliary features :math:`(B, C, T)`.\n        Returns:\n            Tensor: Output tensor (T, out_channels)\n        \"\"\"\n        x = torch.randn([c.shape[0], 
self.in_channels, c.shape[2]])\n        x = x.to(self.first_conv.bias.device)\n\n        c = c.to(next(self.parameters()))\n        return self.forward(c)\n"
  },
  {
    "path": "TTS/vocoder/models/wavegrad.py",
    "content": "from dataclasses import dataclass, field\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport torch\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.nn.utils import weight_norm\nfrom torch.utils.data import DataLoader\nfrom torch.utils.data.distributed import DistributedSampler\nfrom trainer.trainer_utils import get_optimizer, get_scheduler\n\nfrom TTS.utils.io import load_fsspec\nfrom TTS.vocoder.datasets import WaveGradDataset\nfrom TTS.vocoder.layers.wavegrad import Conv1d, DBlock, FiLM, UBlock\nfrom TTS.vocoder.models.base_vocoder import BaseVocoder\nfrom TTS.vocoder.utils.generic_utils import plot_results\n\n\n@dataclass\nclass WavegradArgs(Coqpit):\n    in_channels: int = 80\n    out_channels: int = 1\n    use_weight_norm: bool = False\n    y_conv_channels: int = 32\n    x_conv_channels: int = 768\n    dblock_out_channels: List[int] = field(default_factory=lambda: [128, 128, 256, 512])\n    ublock_out_channels: List[int] = field(default_factory=lambda: [512, 512, 256, 128, 128])\n    upsample_factors: List[int] = field(default_factory=lambda: [4, 4, 4, 2, 2])\n    upsample_dilations: List[List[int]] = field(\n        default_factory=lambda: [[1, 2, 1, 2], [1, 2, 1, 2], [1, 2, 4, 8], [1, 2, 4, 8], [1, 2, 4, 8]]\n    )\n\n\nclass Wavegrad(BaseVocoder):\n    \"\"\"🐸 🌊 WaveGrad 🌊 model.\n    Paper - https://arxiv.org/abs/2009.00713\n\n    Examples:\n        Initializing the model.\n\n        >>> from TTS.vocoder.configs import WavegradConfig\n        >>> config = WavegradConfig()\n        >>> model = Wavegrad(config)\n\n    Paper Abstract:\n        This paper introduces WaveGrad, a conditional model for waveform generation which estimates gradients of the\n        data density. The model is built on prior work on score matching and diffusion probabilistic models. 
It starts\n        from a Gaussian white noise signal and iteratively refines the signal via a gradient-based sampler conditioned\n        on the mel-spectrogram. WaveGrad offers a natural way to trade inference speed for sample quality by adjusting\n        the number of refinement steps, and bridges the gap between non-autoregressive and autoregressive models in\n        terms of audio quality. We find that it can generate high fidelity audio samples using as few as six iterations.\n        Experiments reveal WaveGrad to generate high fidelity audio, outperforming adversarial non-autoregressive\n        baselines and matching a strong likelihood-based autoregressive baseline using fewer sequential operations.\n        Audio samples are available at this https URL.\n    \"\"\"\n\n    # pylint: disable=dangerous-default-value\n    def __init__(self, config: Coqpit):\n        super().__init__(config)\n        self.config = config\n        self.use_weight_norm = config.model_params.use_weight_norm\n        self.hop_len = np.prod(config.model_params.upsample_factors)\n        self.noise_level = None\n        self.num_steps = None\n        self.beta = None\n        self.alpha = None\n        self.alpha_hat = None\n        self.c1 = None\n        self.c2 = None\n        self.sigma = None\n\n        # dblocks\n        self.y_conv = Conv1d(1, config.model_params.y_conv_channels, 5, padding=2)\n        self.dblocks = nn.ModuleList([])\n        ic = config.model_params.y_conv_channels\n        for oc, df in zip(config.model_params.dblock_out_channels, reversed(config.model_params.upsample_factors)):\n            self.dblocks.append(DBlock(ic, oc, df))\n            ic = oc\n\n        # film\n        self.film = nn.ModuleList([])\n        ic = config.model_params.y_conv_channels\n        for oc in reversed(config.model_params.ublock_out_channels):\n            self.film.append(FiLM(ic, oc))\n            ic = oc\n\n        # ublocksn\n        self.ublocks = nn.ModuleList([])\n 
       ic = config.model_params.x_conv_channels\n        for oc, uf, ud in zip(\n            config.model_params.ublock_out_channels,\n            config.model_params.upsample_factors,\n            config.model_params.upsample_dilations,\n        ):\n            self.ublocks.append(UBlock(ic, oc, uf, ud))\n            ic = oc\n\n        self.x_conv = Conv1d(config.model_params.in_channels, config.model_params.x_conv_channels, 3, padding=1)\n        self.out_conv = Conv1d(oc, config.model_params.out_channels, 3, padding=1)\n\n        if config.model_params.use_weight_norm:\n            self.apply_weight_norm()\n\n    def forward(self, x, spectrogram, noise_scale):\n        shift_and_scale = []\n\n        x = self.y_conv(x)\n        shift_and_scale.append(self.film[0](x, noise_scale))\n\n        for film, layer in zip(self.film[1:], self.dblocks):\n            x = layer(x)\n            shift_and_scale.append(film(x, noise_scale))\n\n        x = self.x_conv(spectrogram)\n        for layer, (film_shift, film_scale) in zip(self.ublocks, reversed(shift_and_scale)):\n            x = layer(x, film_shift, film_scale)\n        x = self.out_conv(x)\n        return x\n\n    def load_noise_schedule(self, path):\n        beta = np.load(path, allow_pickle=True).item()[\"beta\"]  # pylint: disable=unexpected-keyword-arg\n        self.compute_noise_level(beta)\n\n    @torch.no_grad()\n    def inference(self, x, y_n=None):\n        \"\"\"\n        Shapes:\n            x: :math:`[B, C , T]`\n            y_n: :math:`[B, 1, T]`\n        \"\"\"\n        if y_n is None:\n            y_n = torch.randn(x.shape[0], 1, self.hop_len * x.shape[-1])\n        else:\n            y_n = torch.FloatTensor(y_n).unsqueeze(0).unsqueeze(0)\n        y_n = y_n.type_as(x)\n        sqrt_alpha_hat = self.noise_level.to(x)\n        for n in range(len(self.alpha) - 1, -1, -1):\n            y_n = self.c1[n] * (y_n - self.c2[n] * self.forward(y_n, x, sqrt_alpha_hat[n].repeat(x.shape[0])))\n            if n > 
0:\n                z = torch.randn_like(y_n)\n                y_n += self.sigma[n - 1] * z\n            y_n.clamp_(-1.0, 1.0)\n        return y_n\n\n    def compute_y_n(self, y_0):\n        \"\"\"Compute noisy audio based on noise schedule\"\"\"\n        self.noise_level = self.noise_level.to(y_0)\n        if len(y_0.shape) == 3:\n            y_0 = y_0.squeeze(1)\n        s = torch.randint(0, self.num_steps - 1, [y_0.shape[0]])\n        l_a, l_b = self.noise_level[s], self.noise_level[s + 1]\n        noise_scale = l_a + torch.rand(y_0.shape[0]).to(y_0) * (l_b - l_a)\n        noise_scale = noise_scale.unsqueeze(1)\n        noise = torch.randn_like(y_0)\n        noisy_audio = noise_scale * y_0 + (1.0 - noise_scale**2) ** 0.5 * noise\n        return noise.unsqueeze(1), noisy_audio.unsqueeze(1), noise_scale[:, 0]\n\n    def compute_noise_level(self, beta):\n        \"\"\"Compute noise schedule parameters\"\"\"\n        self.num_steps = len(beta)\n        alpha = 1 - beta\n        alpha_hat = np.cumprod(alpha)\n        noise_level = np.concatenate([[1.0], alpha_hat**0.5], axis=0)\n        noise_level = alpha_hat**0.5\n\n        # pylint: disable=not-callable\n        self.beta = torch.tensor(beta.astype(np.float32))\n        self.alpha = torch.tensor(alpha.astype(np.float32))\n        self.alpha_hat = torch.tensor(alpha_hat.astype(np.float32))\n        self.noise_level = torch.tensor(noise_level.astype(np.float32))\n\n        self.c1 = 1 / self.alpha**0.5\n        self.c2 = (1 - self.alpha) / (1 - self.alpha_hat) ** 0.5\n        self.sigma = ((1.0 - self.alpha_hat[:-1]) / (1.0 - self.alpha_hat[1:]) * self.beta[1:]) ** 0.5\n\n    def remove_weight_norm(self):\n        for _, layer in enumerate(self.dblocks):\n            if len(layer.state_dict()) != 0:\n                try:\n                    nn.utils.remove_weight_norm(layer)\n                except ValueError:\n                    layer.remove_weight_norm()\n\n        for _, layer in enumerate(self.film):\n         
   if len(layer.state_dict()) != 0:\n                try:\n                    nn.utils.remove_weight_norm(layer)\n                except ValueError:\n                    layer.remove_weight_norm()\n\n        for _, layer in enumerate(self.ublocks):\n            if len(layer.state_dict()) != 0:\n                try:\n                    nn.utils.remove_weight_norm(layer)\n                except ValueError:\n                    layer.remove_weight_norm()\n\n        nn.utils.remove_weight_norm(self.x_conv)\n        nn.utils.remove_weight_norm(self.out_conv)\n        nn.utils.remove_weight_norm(self.y_conv)\n\n    def apply_weight_norm(self):\n        for _, layer in enumerate(self.dblocks):\n            if len(layer.state_dict()) != 0:\n                layer.apply_weight_norm()\n\n        for _, layer in enumerate(self.film):\n            if len(layer.state_dict()) != 0:\n                layer.apply_weight_norm()\n\n        for _, layer in enumerate(self.ublocks):\n            if len(layer.state_dict()) != 0:\n                layer.apply_weight_norm()\n\n        self.x_conv = weight_norm(self.x_conv)\n        self.out_conv = weight_norm(self.out_conv)\n        self.y_conv = weight_norm(self.y_conv)\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n            if self.config.model_params.use_weight_norm:\n                self.remove_weight_norm()\n            betas = np.linspace(\n                config[\"test_noise_schedule\"][\"min_val\"],\n                config[\"test_noise_schedule\"][\"max_val\"],\n                config[\"test_noise_schedule\"][\"num_steps\"],\n            )\n            self.compute_noise_level(betas)\n        else:\n    
        betas = np.linspace(\n                config[\"train_noise_schedule\"][\"min_val\"],\n                config[\"train_noise_schedule\"][\"max_val\"],\n                config[\"train_noise_schedule\"][\"num_steps\"],\n            )\n            self.compute_noise_level(betas)\n\n    def train_step(self, batch: Dict, criterion: Dict) -> Tuple[Dict, Dict]:\n        # format data\n        x = batch[\"input\"]\n        y = batch[\"waveform\"]\n\n        # set noise scale\n        noise, x_noisy, noise_scale = self.compute_y_n(y)\n\n        # forward pass\n        noise_hat = self.forward(x_noisy, x, noise_scale)\n\n        # compute losses\n        loss = criterion(noise, noise_hat)\n        return {\"model_output\": noise_hat}, {\"loss\": loss}\n\n    def train_log(  # pylint: disable=no-self-use\n        self, batch: Dict, outputs: Dict, logger: \"Logger\", assets: Dict, steps: int  # pylint: disable=unused-argument\n    ) -> Tuple[Dict, np.ndarray]:\n        pass\n\n    @torch.no_grad()\n    def eval_step(self, batch: Dict, criterion: nn.Module) -> Tuple[Dict, Dict]:\n        return self.train_step(batch, criterion)\n\n    def eval_log(  # pylint: disable=no-self-use\n        self, batch: Dict, outputs: Dict, logger: \"Logger\", assets: Dict, steps: int  # pylint: disable=unused-argument\n    ) -> None:\n        pass\n\n    def test(self, assets: Dict, test_loader: \"DataLoader\", outputs=None):  # pylint: disable=unused-argument\n        # setup noise schedule and inference\n        ap = assets[\"audio_processor\"]\n        noise_schedule = self.config[\"test_noise_schedule\"]\n        betas = np.linspace(noise_schedule[\"min_val\"], noise_schedule[\"max_val\"], noise_schedule[\"num_steps\"])\n        self.compute_noise_level(betas)\n        samples = test_loader.dataset.load_test_samples(1)\n        for sample in samples:\n            x = sample[0]\n            x = x[None, :, :].to(next(self.parameters()).device)\n            y = sample[1]\n            y = 
y[None, :]\n            # compute voice\n            y_pred = self.inference(x)\n            # compute spectrograms\n            figures = plot_results(y_pred, y, ap, \"test\")\n            # Sample audio\n            sample_voice = y_pred[0].squeeze(0).detach().cpu().numpy()\n        return figures, {\"test/audio\": sample_voice}\n\n    def get_optimizer(self):\n        return get_optimizer(self.config.optimizer, self.config.optimizer_params, self.config.lr, self)\n\n    def get_scheduler(self, optimizer):\n        return get_scheduler(self.config.lr_scheduler, self.config.lr_scheduler_params, optimizer)\n\n    @staticmethod\n    def get_criterion():\n        return torch.nn.L1Loss()\n\n    @staticmethod\n    def format_batch(batch: Dict) -> Dict:\n        # return a whole audio segment\n        m, y = batch[0], batch[1]\n        y = y.unsqueeze(1)\n        return {\"input\": m, \"waveform\": y}\n\n    def get_data_loader(self, config: Coqpit, assets: Dict, is_eval: bool, samples: List, verbose: bool, num_gpus: int):\n        ap = assets[\"audio_processor\"]\n        dataset = WaveGradDataset(\n            ap=ap,\n            items=samples,\n            seq_len=self.config.seq_len,\n            hop_len=ap.hop_length,\n            pad_short=self.config.pad_short,\n            conv_pad=self.config.conv_pad,\n            is_training=not is_eval,\n            return_segments=True,\n            use_noise_augment=False,\n            use_cache=config.use_cache,\n            verbose=verbose,\n        )\n        sampler = DistributedSampler(dataset) if num_gpus > 1 else None\n        loader = DataLoader(\n            dataset,\n            batch_size=self.config.batch_size,\n            shuffle=num_gpus <= 1,\n            drop_last=False,\n            sampler=sampler,\n            num_workers=self.config.num_eval_loader_workers if is_eval else self.config.num_loader_workers,\n            pin_memory=False,\n        )\n        return loader\n\n    def on_epoch_start(self, 
trainer):  # pylint: disable=unused-argument\n        noise_schedule = self.config[\"train_noise_schedule\"]\n        betas = np.linspace(noise_schedule[\"min_val\"], noise_schedule[\"max_val\"], noise_schedule[\"num_steps\"])\n        self.compute_noise_level(betas)\n\n    @staticmethod\n    def init_from_config(config: \"WavegradConfig\"):\n        return Wavegrad(config)\n"
  },
  {
    "path": "TTS/vocoder/models/wavernn.py",
    "content": "import sys\nimport time\nfrom dataclasses import dataclass, field\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\nfrom coqpit import Coqpit\nfrom torch import nn\nfrom torch.utils.data import DataLoader\nfrom torch.utils.data.distributed import DistributedSampler\n\nfrom TTS.tts.utils.visual import plot_spectrogram\nfrom TTS.utils.audio import AudioProcessor\nfrom TTS.utils.io import load_fsspec\nfrom TTS.vocoder.datasets.wavernn_dataset import WaveRNNDataset\nfrom TTS.vocoder.layers.losses import WaveRNNLoss\nfrom TTS.vocoder.models.base_vocoder import BaseVocoder\nfrom TTS.vocoder.utils.distribution import sample_from_discretized_mix_logistic, sample_from_gaussian\n\n\ndef stream(string, variables):\n    sys.stdout.write(f\"\\r{string}\" % variables)\n\n\n# pylint: disable=abstract-method\n# relates https://github.com/pytorch/pytorch/issues/42305\nclass ResBlock(nn.Module):\n    def __init__(self, dims):\n        super().__init__()\n        self.conv1 = nn.Conv1d(dims, dims, kernel_size=1, bias=False)\n        self.conv2 = nn.Conv1d(dims, dims, kernel_size=1, bias=False)\n        self.batch_norm1 = nn.BatchNorm1d(dims)\n        self.batch_norm2 = nn.BatchNorm1d(dims)\n\n    def forward(self, x):\n        residual = x\n        x = self.conv1(x)\n        x = self.batch_norm1(x)\n        x = F.relu(x)\n        x = self.conv2(x)\n        x = self.batch_norm2(x)\n        return x + residual\n\n\nclass MelResNet(nn.Module):\n    def __init__(self, num_res_blocks, in_dims, compute_dims, res_out_dims, pad):\n        super().__init__()\n        k_size = pad * 2 + 1\n        self.conv_in = nn.Conv1d(in_dims, compute_dims, kernel_size=k_size, bias=False)\n        self.batch_norm = nn.BatchNorm1d(compute_dims)\n        self.layers = nn.ModuleList()\n        for _ in range(num_res_blocks):\n            self.layers.append(ResBlock(compute_dims))\n        self.conv_out = nn.Conv1d(compute_dims, 
res_out_dims, kernel_size=1)\n\n    def forward(self, x):\n        x = self.conv_in(x)\n        x = self.batch_norm(x)\n        x = F.relu(x)\n        for f in self.layers:\n            x = f(x)\n        x = self.conv_out(x)\n        return x\n\n\nclass Stretch2d(nn.Module):\n    def __init__(self, x_scale, y_scale):\n        super().__init__()\n        self.x_scale = x_scale\n        self.y_scale = y_scale\n\n    def forward(self, x):\n        b, c, h, w = x.size()\n        x = x.unsqueeze(-1).unsqueeze(3)\n        x = x.repeat(1, 1, 1, self.y_scale, 1, self.x_scale)\n        return x.view(b, c, h * self.y_scale, w * self.x_scale)\n\n\nclass UpsampleNetwork(nn.Module):\n    def __init__(\n        self,\n        feat_dims,\n        upsample_scales,\n        compute_dims,\n        num_res_blocks,\n        res_out_dims,\n        pad,\n        use_aux_net,\n    ):\n        super().__init__()\n        # np.cumproduct was removed in NumPy 2.0; np.prod yields the same total scale\n        self.total_scale = np.prod(upsample_scales)\n        self.indent = pad * self.total_scale\n        self.use_aux_net = use_aux_net\n        if use_aux_net:\n            self.resnet = MelResNet(num_res_blocks, feat_dims, compute_dims, res_out_dims, pad)\n            self.resnet_stretch = Stretch2d(self.total_scale, 1)\n        self.up_layers = nn.ModuleList()\n        for scale in upsample_scales:\n            k_size = (1, scale * 2 + 1)\n            padding = (0, scale)\n            stretch = Stretch2d(scale, 1)\n            conv = nn.Conv2d(1, 1, kernel_size=k_size, padding=padding, bias=False)\n            conv.weight.data.fill_(1.0 / k_size[1])\n            self.up_layers.append(stretch)\n            self.up_layers.append(conv)\n\n    def forward(self, m):\n        if self.use_aux_net:\n            aux = self.resnet(m).unsqueeze(1)\n            aux = self.resnet_stretch(aux)\n            aux = aux.squeeze(1)\n            aux = aux.transpose(1, 2)\n        else:\n            aux = None\n        m = m.unsqueeze(1)\n        for f in self.up_layers:\n            m 
= f(m)\n        m = m.squeeze(1)[:, :, self.indent : -self.indent]\n        return m.transpose(1, 2), aux\n\n\nclass Upsample(nn.Module):\n    def __init__(self, scale, pad, num_res_blocks, feat_dims, compute_dims, res_out_dims, use_aux_net):\n        super().__init__()\n        self.scale = scale\n        self.pad = pad\n        self.indent = pad * scale\n        self.use_aux_net = use_aux_net\n        self.resnet = MelResNet(num_res_blocks, feat_dims, compute_dims, res_out_dims, pad)\n\n    def forward(self, m):\n        if self.use_aux_net:\n            aux = self.resnet(m)\n            aux = torch.nn.functional.interpolate(aux, scale_factor=self.scale, mode=\"linear\", align_corners=True)\n            aux = aux.transpose(1, 2)\n        else:\n            aux = None\n        m = torch.nn.functional.interpolate(m, scale_factor=self.scale, mode=\"linear\", align_corners=True)\n        m = m[:, :, self.indent : -self.indent]\n        m = m * 0.045  # empirically found\n\n        return m.transpose(1, 2), aux\n\n\n@dataclass\nclass WavernnArgs(Coqpit):\n    \"\"\"🐸 WaveRNN model arguments.\n\n    rnn_dims (int):\n        Number of hidden channels in RNN layers. Defaults to 512.\n    fc_dims (int):\n        Number of hidden channels in fully-connected layers. Defaults to 512.\n    compute_dims (int):\n        Number of hidden channels in the feature ResNet. Defaults to 128.\n    res_out_dims (int):\n        Number of hidden channels in the feature ResNet output. Defaults to 128.\n    num_res_blocks (int):\n        Number of residual blocks in the ResNet. Defaults to 10.\n    use_aux_net (bool):\n        Enable/disable the feature ResNet. Defaults to True.\n    use_upsample_net (bool):\n        Enable/disable the upsampling network. If False, basic upsampling is used. Defaults to True.\n    upsample_factors (list):\n        Upsampling factors. The product of the values must match the `hop_length`. 
Defaults to `[4, 8, 8]`.\n    mode (str or int):\n        Output mode of the WaveRNN vocoder. `mold` for Mixture of Logistic Distribution, `gauss` for a single\n        Gaussian Distribution, or an integer number of bits for quantized bits as the model's output.\n        Defaults to `mold`.\n    mulaw (bool):\n        Enable/disable the use of Mulaw quantization for training. Only applicable if `mode` is an integer\n        number of bits. Defaults to `True`.\n    pad (int):\n        Padding applied to the input feature frames against the convolution layers of the feature network.\n        Defaults to 2.\n    feat_dims (int):\n        Number of channels of the input feature frames. Defaults to 80.\n    \"\"\"\n\n    rnn_dims: int = 512\n    fc_dims: int = 512\n    compute_dims: int = 128\n    res_out_dims: int = 128\n    num_res_blocks: int = 10\n    use_aux_net: bool = True\n    use_upsample_net: bool = True\n    upsample_factors: List[int] = field(default_factory=lambda: [4, 8, 8])\n    mode: str = \"mold\"  # mold [string], gauss [string], bits [int]\n    mulaw: bool = True  # apply mulaw if mode is bits\n    pad: int = 2\n    feat_dims: int = 80\n\n\nclass Wavernn(BaseVocoder):\n    def __init__(self, config: Coqpit):\n        \"\"\"🐸 WaveRNN model.\n        Original paper - https://arxiv.org/abs/1802.08435\n        Official implementation - https://github.com/fatchord/WaveRNN\n\n        Args:\n            config (Coqpit): Model configuration.\n\n        Raises:\n            RuntimeError: If `mode` is neither an integer, `mold`, nor `gauss`.\n\n        Examples:\n            >>> from TTS.vocoder.configs import WavernnConfig\n            >>> config = WavernnConfig()\n            >>> model = Wavernn(config)\n\n        Paper Abstract:\n            Sequential models achieve state-of-the-art results in audio, visual and textual domains with respect to\n            both estimating the data distribution and generating high-quality samples. Efficient sampling for this\n            class of models has however remained an elusive problem. 
With a focus on text-to-speech synthesis, we\n            describe a set of general techniques for reducing sampling time while maintaining high output quality.\n            We first describe a single-layer recurrent neural network, the WaveRNN, with a dual softmax layer that\n            matches the quality of the state-of-the-art WaveNet model. The compact form of the network makes it\n            possible to generate 24kHz 16-bit audio 4x faster than real time on a GPU. Second, we apply a weight\n            pruning technique to reduce the number of weights in the WaveRNN. We find that, for a constant number of\n            parameters, large sparse networks perform better than small dense networks and this relationship holds for\n            sparsity levels beyond 96%. The small number of weights in a Sparse WaveRNN makes it possible to sample\n            high-fidelity audio on a mobile CPU in real time. Finally, we propose a new generation scheme based on\n            subscaling that folds a long sequence into a batch of shorter sequences and allows one to generate multiple\n            samples at once. The Subscale WaveRNN produces 16 samples per step without loss of quality and offers an\n            orthogonal method for increasing sampling efficiency.\n        \"\"\"\n        super().__init__(config)\n\n        if isinstance(self.args.mode, int):\n            self.n_classes = 2**self.args.mode\n        elif self.args.mode == \"mold\":\n            self.n_classes = 3 * 10\n        elif self.args.mode == \"gauss\":\n            self.n_classes = 2\n        else:\n            raise RuntimeError(f\"Unknown model mode value - {self.args.mode}\")\n\n        self.ap = AudioProcessor(**config.audio.to_dict())\n        self.aux_dims = self.args.res_out_dims // 4\n\n        if self.args.use_upsample_net:\n            assert (\n                np.prod(self.args.upsample_factors) == config.audio.hop_length\n            ), \" [!] 
upsample scales needs to be equal to hop_length\"\n            self.upsample = UpsampleNetwork(\n                self.args.feat_dims,\n                self.args.upsample_factors,\n                self.args.compute_dims,\n                self.args.num_res_blocks,\n                self.args.res_out_dims,\n                self.args.pad,\n                self.args.use_aux_net,\n            )\n        else:\n            self.upsample = Upsample(\n                config.audio.hop_length,\n                self.args.pad,\n                self.args.num_res_blocks,\n                self.args.feat_dims,\n                self.args.compute_dims,\n                self.args.res_out_dims,\n                self.args.use_aux_net,\n            )\n        if self.args.use_aux_net:\n            self.I = nn.Linear(self.args.feat_dims + self.aux_dims + 1, self.args.rnn_dims)\n            self.rnn1 = nn.GRU(self.args.rnn_dims, self.args.rnn_dims, batch_first=True)\n            self.rnn2 = nn.GRU(self.args.rnn_dims + self.aux_dims, self.args.rnn_dims, batch_first=True)\n            self.fc1 = nn.Linear(self.args.rnn_dims + self.aux_dims, self.args.fc_dims)\n            self.fc2 = nn.Linear(self.args.fc_dims + self.aux_dims, self.args.fc_dims)\n            self.fc3 = nn.Linear(self.args.fc_dims, self.n_classes)\n        else:\n            self.I = nn.Linear(self.args.feat_dims + 1, self.args.rnn_dims)\n            self.rnn1 = nn.GRU(self.args.rnn_dims, self.args.rnn_dims, batch_first=True)\n            self.rnn2 = nn.GRU(self.args.rnn_dims, self.args.rnn_dims, batch_first=True)\n            self.fc1 = nn.Linear(self.args.rnn_dims, self.args.fc_dims)\n            self.fc2 = nn.Linear(self.args.fc_dims, self.args.fc_dims)\n            self.fc3 = nn.Linear(self.args.fc_dims, self.n_classes)\n\n    def forward(self, x, mels):\n        bsize = x.size(0)\n        h1 = torch.zeros(1, bsize, self.args.rnn_dims).to(x.device)\n        h2 = torch.zeros(1, bsize, self.args.rnn_dims).to(x.device)\n      
  mels, aux = self.upsample(mels)\n\n        if self.args.use_aux_net:\n            aux_idx = [self.aux_dims * i for i in range(5)]\n            a1 = aux[:, :, aux_idx[0] : aux_idx[1]]\n            a2 = aux[:, :, aux_idx[1] : aux_idx[2]]\n            a3 = aux[:, :, aux_idx[2] : aux_idx[3]]\n            a4 = aux[:, :, aux_idx[3] : aux_idx[4]]\n\n        x = (\n            torch.cat([x.unsqueeze(-1), mels, a1], dim=2)\n            if self.args.use_aux_net\n            else torch.cat([x.unsqueeze(-1), mels], dim=2)\n        )\n        x = self.I(x)\n        res = x\n        self.rnn1.flatten_parameters()\n        x, _ = self.rnn1(x, h1)\n\n        x = x + res\n        res = x\n        x = torch.cat([x, a2], dim=2) if self.args.use_aux_net else x\n        self.rnn2.flatten_parameters()\n        x, _ = self.rnn2(x, h2)\n\n        x = x + res\n        x = torch.cat([x, a3], dim=2) if self.args.use_aux_net else x\n        x = F.relu(self.fc1(x))\n\n        x = torch.cat([x, a4], dim=2) if self.args.use_aux_net else x\n        x = F.relu(self.fc2(x))\n        return self.fc3(x)\n\n    def inference(self, mels, batched=None, target=None, overlap=None):\n        self.eval()\n        output = []\n        start = time.time()\n        rnn1 = self.get_gru_cell(self.rnn1)\n        rnn2 = self.get_gru_cell(self.rnn2)\n\n        with torch.no_grad():\n            if isinstance(mels, np.ndarray):\n                mels = torch.FloatTensor(mels).to(str(next(self.parameters()).device))\n\n            if mels.ndim == 2:\n                mels = mels.unsqueeze(0)\n            wave_len = (mels.size(-1) - 1) * self.config.audio.hop_length\n\n            mels = self.pad_tensor(mels.transpose(1, 2), pad=self.args.pad, side=\"both\")\n            mels, aux = self.upsample(mels.transpose(1, 2))\n\n            if batched:\n                mels = self.fold_with_overlap(mels, target, overlap)\n                if aux is not None:\n                    aux = self.fold_with_overlap(aux, target, 
overlap)\n\n            b_size, seq_len, _ = mels.size()\n\n            h1 = torch.zeros(b_size, self.args.rnn_dims).type_as(mels)\n            h2 = torch.zeros(b_size, self.args.rnn_dims).type_as(mels)\n            x = torch.zeros(b_size, 1).type_as(mels)\n\n            if self.args.use_aux_net:\n                d = self.aux_dims\n                aux_split = [aux[:, :, d * i : d * (i + 1)] for i in range(4)]\n\n            for i in range(seq_len):\n                m_t = mels[:, i, :]\n\n                if self.args.use_aux_net:\n                    a1_t, a2_t, a3_t, a4_t = (a[:, i, :] for a in aux_split)\n\n                x = torch.cat([x, m_t, a1_t], dim=1) if self.args.use_aux_net else torch.cat([x, m_t], dim=1)\n                x = self.I(x)\n                h1 = rnn1(x, h1)\n\n                x = x + h1\n                inp = torch.cat([x, a2_t], dim=1) if self.args.use_aux_net else x\n                h2 = rnn2(inp, h2)\n\n                x = x + h2\n                x = torch.cat([x, a3_t], dim=1) if self.args.use_aux_net else x\n                x = F.relu(self.fc1(x))\n\n                x = torch.cat([x, a4_t], dim=1) if self.args.use_aux_net else x\n                x = F.relu(self.fc2(x))\n\n                logits = self.fc3(x)\n\n                if self.args.mode == \"mold\":\n                    sample = sample_from_discretized_mix_logistic(logits.unsqueeze(0).transpose(1, 2))\n                    output.append(sample.view(-1))\n                    x = sample.transpose(0, 1).type_as(mels)\n                elif self.args.mode == \"gauss\":\n                    sample = sample_from_gaussian(logits.unsqueeze(0).transpose(1, 2))\n                    output.append(sample.view(-1))\n                    x = sample.transpose(0, 1).type_as(mels)\n                elif isinstance(self.args.mode, int):\n                    posterior = F.softmax(logits, dim=1)\n                    distrib = torch.distributions.Categorical(posterior)\n\n                    sample = 2 * 
distrib.sample().float() / (self.n_classes - 1.0) - 1.0\n                    output.append(sample)\n                    x = sample.unsqueeze(-1)\n                else:\n                    raise RuntimeError(\"Unknown model mode value - \", self.args.mode)\n\n                if i % 100 == 0:\n                    self.gen_display(i, seq_len, b_size, start)\n\n        output = torch.stack(output).transpose(0, 1)\n        output = output.cpu()\n        if batched:\n            output = output.numpy()\n            output = output.astype(np.float64)\n\n            output = self.xfade_and_unfold(output, target, overlap)\n        else:\n            output = output[0]\n\n        if self.args.mulaw and isinstance(self.args.mode, int):\n            output = AudioProcessor.mulaw_decode(output, self.args.mode)\n\n        # Fade-out at the end to avoid signal cutting out suddenly\n        fade_out = np.linspace(1, 0, 20 * self.config.audio.hop_length)\n        output = output[:wave_len]\n\n        if wave_len > len(fade_out):\n            output[-20 * self.config.audio.hop_length :] *= fade_out\n\n        self.train()\n        return output\n\n    def gen_display(self, i, seq_len, b_size, start):\n        gen_rate = (i + 1) / (time.time() - start) * b_size / 1000\n        realtime_ratio = gen_rate * 1000 / self.config.audio.sample_rate\n        stream(\n            \"%i/%i -- batch_size: %i -- gen_rate: %.1f kHz -- x_realtime: %.1f  \",\n            (i * b_size, seq_len * b_size, b_size, gen_rate, realtime_ratio),\n        )\n\n    def fold_with_overlap(self, x, target, overlap):\n        \"\"\"Fold the tensor with overlap for quick batched inference.\n            Overlap will be used for crossfading in xfade_and_unfold()\n        Args:\n            x (tensor)    : Upsampled conditioning features.\n                            shape=(1, timesteps, features)\n            target (int)  : Target timesteps for each index of batch\n            overlap (int) : Timesteps for both xfade 
and rnn warmup\n        Return:\n            (tensor) : shape=(num_folds, target + 2 * overlap, features)\n        Details:\n            x = [[h1, h2, ... hn]]\n            Where each h is a vector of conditioning features\n            Eg: target=2, overlap=1 with x.size(1)=10\n            folded = [[h1, h2, h3, h4],\n                      [h4, h5, h6, h7],\n                      [h7, h8, h9, h10]]\n        \"\"\"\n\n        _, total_len, features = x.size()\n\n        # Calculate variables needed\n        num_folds = (total_len - overlap) // (target + overlap)\n        extended_len = num_folds * (overlap + target) + overlap\n        remaining = total_len - extended_len\n\n        # Pad if some time steps poking out\n        if remaining != 0:\n            num_folds += 1\n            padding = target + 2 * overlap - remaining\n            x = self.pad_tensor(x, padding, side=\"after\")\n\n        folded = torch.zeros(num_folds, target + 2 * overlap, features).to(x.device)\n\n        # Get the values for the folded tensor\n        for i in range(num_folds):\n            start = i * (target + overlap)\n            end = start + target + 2 * overlap\n            folded[i] = x[:, start:end, :]\n\n        return folded\n\n    @staticmethod\n    def get_gru_cell(gru):\n        gru_cell = nn.GRUCell(gru.input_size, gru.hidden_size)\n        gru_cell.weight_hh.data = gru.weight_hh_l0.data\n        gru_cell.weight_ih.data = gru.weight_ih_l0.data\n        gru_cell.bias_hh.data = gru.bias_hh_l0.data\n        gru_cell.bias_ih.data = gru.bias_ih_l0.data\n        return gru_cell\n\n    @staticmethod\n    def pad_tensor(x, pad, side=\"both\"):\n        # NB - this is just a quick method i need right now\n        # i.e., it won't generalise to other shapes/dims\n        b, t, c = x.size()\n        total = t + 2 * pad if side == \"both\" else t + pad\n        padded = torch.zeros(b, total, c).to(x.device)\n        if side in (\"before\", \"both\"):\n            padded[:, pad : pad 
+ t, :] = x\n        elif side == \"after\":\n            padded[:, :t, :] = x\n        return padded\n\n    @staticmethod\n    def xfade_and_unfold(y, target, overlap):\n        \"\"\"Applies a crossfade and unfolds into a 1d array.\n        Args:\n            y (ndarray)   : Batched sequences of audio samples\n                            shape=(num_folds, target + 2 * overlap)\n                            dtype=np.float64\n            target (int)  : Unused; recomputed from y.shape and overlap\n            overlap (int) : Timesteps for both xfade and rnn warmup\n        Return:\n            (ndarray) : audio samples in a 1d array\n                        shape=(total_len)\n                        dtype=np.float64\n        Details:\n            y = [[seq1],\n                 [seq2],\n                 [seq3]]\n            Apply a gain envelope at both ends of the sequences\n            y = [[seq1_in, seq1_target, seq1_out],\n                 [seq2_in, seq2_target, seq2_out],\n                 [seq3_in, seq3_target, seq3_out]]\n            Stagger and add up the groups of samples:\n            [seq1_in, seq1_target, (seq1_out + seq2_in), seq2_target, ...]\n        \"\"\"\n\n        num_folds, length = y.shape\n        target = length - 2 * overlap\n        total_len = num_folds * (target + overlap) + overlap\n\n        # Need some silence for the rnn warmup\n        silence_len = overlap // 2\n        fade_len = overlap - silence_len\n        silence = np.zeros((silence_len), dtype=np.float64)\n\n        # Equal power crossfade\n        t = np.linspace(-1, 1, fade_len, dtype=np.float64)\n        fade_in = np.sqrt(0.5 * (1 + t))\n        fade_out = np.sqrt(0.5 * (1 - t))\n\n        # Concat the silence to the fades\n        fade_in = np.concatenate([silence, fade_in])\n        fade_out = np.concatenate([fade_out, silence])\n\n        # Apply the gain to the overlap samples\n        y[:, :overlap] *= fade_in\n        y[:, -overlap:] *= fade_out\n\n        unfolded = np.zeros((total_len), dtype=np.float64)\n\n        # Loop to add up all the 
samples\n        for i in range(num_folds):\n            start = i * (target + overlap)\n            end = start + target + 2 * overlap\n            unfolded[start:end] += y[i]\n\n        return unfolded\n\n    def load_checkpoint(\n        self, config, checkpoint_path, eval=False, cache=False\n    ):  # pylint: disable=unused-argument, redefined-builtin\n        state = load_fsspec(checkpoint_path, map_location=torch.device(\"cpu\"), cache=cache)\n        self.load_state_dict(state[\"model\"])\n        if eval:\n            self.eval()\n            assert not self.training\n\n    def train_step(self, batch: Dict, criterion: Dict) -> Tuple[Dict, Dict]:\n        mels = batch[\"input\"]\n        waveform = batch[\"waveform\"]\n        waveform_coarse = batch[\"waveform_coarse\"]\n\n        y_hat = self.forward(waveform, mels)\n        if isinstance(self.args.mode, int):\n            y_hat = y_hat.transpose(1, 2).unsqueeze(-1)\n        else:\n            waveform_coarse = waveform_coarse.float()\n        waveform_coarse = waveform_coarse.unsqueeze(-1)\n        # compute losses\n        loss_dict = criterion(y_hat, waveform_coarse)\n        return {\"model_output\": y_hat}, loss_dict\n\n    def eval_step(self, batch: Dict, criterion: Dict) -> Tuple[Dict, Dict]:\n        return self.train_step(batch, criterion)\n\n    @torch.no_grad()\n    def test(\n        self, assets: Dict, test_loader: \"DataLoader\", output: Dict  # pylint: disable=unused-argument\n    ) -> Tuple[Dict, Dict]:\n        ap = self.ap\n        figures = {}\n        audios = {}\n        samples = test_loader.dataset.load_test_samples(1)\n        for idx, sample in enumerate(samples):\n            x = torch.FloatTensor(sample[0])\n            x = x.to(next(self.parameters()).device)\n            y_hat = self.inference(x, self.config.batched, self.config.target_samples, self.config.overlap_samples)\n            x_hat = ap.melspectrogram(y_hat)\n            figures.update(\n                {\n            
        f\"test_{idx}/ground_truth\": plot_spectrogram(x.T),\n                    f\"test_{idx}/prediction\": plot_spectrogram(x_hat.T),\n                }\n            )\n            audios.update({f\"test_{idx}/audio\": y_hat})\n            # audios.update({f\"real_{idx}/audio\": y_hat})\n        return figures, audios\n\n    def test_log(\n        self, outputs: Dict, logger: \"Logger\", assets: Dict, steps: int  # pylint: disable=unused-argument\n    ) -> Tuple[Dict, np.ndarray]:\n        figures, audios = outputs\n        logger.eval_figures(steps, figures)\n        logger.eval_audios(steps, audios, self.ap.sample_rate)\n\n    @staticmethod\n    def format_batch(batch: Dict) -> Dict:\n        waveform = batch[0]\n        mels = batch[1]\n        waveform_coarse = batch[2]\n        return {\"input\": mels, \"waveform\": waveform, \"waveform_coarse\": waveform_coarse}\n\n    def get_data_loader(  # pylint: disable=no-self-use\n        self,\n        config: Coqpit,\n        assets: Dict,\n        is_eval: True,\n        samples: List,\n        verbose: bool,\n        num_gpus: int,\n    ):\n        ap = self.ap\n        dataset = WaveRNNDataset(\n            ap=ap,\n            items=samples,\n            seq_len=config.seq_len,\n            hop_len=ap.hop_length,\n            pad=config.model_args.pad,\n            mode=config.model_args.mode,\n            mulaw=config.model_args.mulaw,\n            is_training=not is_eval,\n            verbose=verbose,\n        )\n        sampler = DistributedSampler(dataset, shuffle=True) if num_gpus > 1 else None\n        loader = DataLoader(\n            dataset,\n            batch_size=1 if is_eval else config.batch_size,\n            shuffle=num_gpus == 0,\n            collate_fn=dataset.collate,\n            sampler=sampler,\n            num_workers=config.num_eval_loader_workers if is_eval else config.num_loader_workers,\n            pin_memory=True,\n        )\n        return loader\n\n    def get_criterion(self):\n    
    # define train functions\n        return WaveRNNLoss(self.args.mode)\n\n    @staticmethod\n    def init_from_config(config: \"WavernnConfig\"):\n        return Wavernn(config)\n"
  },
  {
    "path": "TTS/vocoder/utils/__init__.py",
    "content": ""
  },
  {
    "path": "TTS/vocoder/utils/distribution.py",
    "content": "import math\n\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\nfrom torch.distributions.normal import Normal\n\n\ndef gaussian_loss(y_hat, y, log_std_min=-7.0):\n    assert y_hat.dim() == 3\n    assert y_hat.size(2) == 2\n    mean = y_hat[:, :, :1]\n    log_std = torch.clamp(y_hat[:, :, 1:], min=log_std_min)\n    # TODO: replace with pytorch dist\n    log_probs = -0.5 * (-math.log(2.0 * math.pi) - 2.0 * log_std - torch.pow(y - mean, 2) * torch.exp((-2.0 * log_std)))\n    return log_probs.squeeze().mean()\n\n\ndef sample_from_gaussian(y_hat, log_std_min=-7.0, scale_factor=1.0):\n    assert y_hat.size(2) == 2\n    mean = y_hat[:, :, :1]\n    log_std = torch.clamp(y_hat[:, :, 1:], min=log_std_min)\n    dist = Normal(\n        mean,\n        torch.exp(log_std),\n    )\n    sample = dist.sample()\n    sample = torch.clamp(torch.clamp(sample, min=-scale_factor), max=scale_factor)\n    del dist\n    return sample\n\n\ndef log_sum_exp(x):\n    \"\"\"numerically stable log_sum_exp implementation that prevents overflow\"\"\"\n    # TF ordering\n    axis = len(x.size()) - 1\n    m, _ = torch.max(x, dim=axis)\n    m2, _ = torch.max(x, dim=axis, keepdim=True)\n    return m + torch.log(torch.sum(torch.exp(x - m2), dim=axis))\n\n\n# It is adapted from https://github.com/r9y9/wavenet_vocoder/blob/master/wavenet_vocoder/mixture.py\ndef discretized_mix_logistic_loss(y_hat, y, num_classes=65536, log_scale_min=None, reduce=True):\n    if log_scale_min is None:\n        log_scale_min = float(np.log(1e-14))\n    y_hat = y_hat.permute(0, 2, 1)\n    assert y_hat.dim() == 3\n    assert y_hat.size(1) % 3 == 0\n    nr_mix = y_hat.size(1) // 3\n\n    # (B x T x C)\n    y_hat = y_hat.transpose(1, 2)\n\n    # unpack parameters. 
(B, T, num_mixtures) x 3\n    logit_probs = y_hat[:, :, :nr_mix]\n    means = y_hat[:, :, nr_mix : 2 * nr_mix]\n    log_scales = torch.clamp(y_hat[:, :, 2 * nr_mix : 3 * nr_mix], min=log_scale_min)\n\n    # B x T x 1 -> B x T x num_mixtures\n    y = y.expand_as(means)\n\n    centered_y = y - means\n    inv_stdv = torch.exp(-log_scales)\n    plus_in = inv_stdv * (centered_y + 1.0 / (num_classes - 1))\n    cdf_plus = torch.sigmoid(plus_in)\n    min_in = inv_stdv * (centered_y - 1.0 / (num_classes - 1))\n    cdf_min = torch.sigmoid(min_in)\n\n    # log probability for edge case of 0 (before scaling)\n    # equivalent: torch.log(F.sigmoid(plus_in))\n    log_cdf_plus = plus_in - F.softplus(plus_in)\n\n    # log probability for edge case of 255 (before scaling)\n    # equivalent: (1 - F.sigmoid(min_in)).log()\n    log_one_minus_cdf_min = -F.softplus(min_in)\n\n    # probability for all other cases\n    cdf_delta = cdf_plus - cdf_min\n\n    mid_in = inv_stdv * centered_y\n    # log probability in the center of the bin, to be used in extreme cases\n    # (not actually used in our code)\n    log_pdf_mid = mid_in - log_scales - 2.0 * F.softplus(mid_in)\n\n    # tf equivalent\n\n    # log_probs = tf.where(x < -0.999, log_cdf_plus,\n    #                      tf.where(x > 0.999, log_one_minus_cdf_min,\n    #                               tf.where(cdf_delta > 1e-5,\n    #                                        tf.log(tf.maximum(cdf_delta, 1e-12)),\n    #                                        log_pdf_mid - np.log(127.5))))\n\n    # TODO: cdf_delta <= 1e-5 actually can happen. How can we choose the value\n    # for num_classes=65536 case? 1e-7? 
not sure..\n    inner_inner_cond = (cdf_delta > 1e-5).float()\n\n    inner_inner_out = inner_inner_cond * torch.log(torch.clamp(cdf_delta, min=1e-12)) + (1.0 - inner_inner_cond) * (\n        log_pdf_mid - np.log((num_classes - 1) / 2)\n    )\n    inner_cond = (y > 0.999).float()\n    inner_out = inner_cond * log_one_minus_cdf_min + (1.0 - inner_cond) * inner_inner_out\n    cond = (y < -0.999).float()\n    log_probs = cond * log_cdf_plus + (1.0 - cond) * inner_out\n\n    log_probs = log_probs + F.log_softmax(logit_probs, -1)\n\n    if reduce:\n        return -torch.mean(log_sum_exp(log_probs))\n    return -log_sum_exp(log_probs).unsqueeze(-1)\n\n\ndef sample_from_discretized_mix_logistic(y, log_scale_min=None):\n    \"\"\"\n    Sample from discretized mixture of logistic distributions\n    Args:\n        y (Tensor): :math:`[B, C, T]`\n        log_scale_min (float): Log scale minimum value\n    Returns:\n        Tensor: sample in range of [-1, 1].\n    \"\"\"\n    if log_scale_min is None:\n        log_scale_min = float(np.log(1e-14))\n    assert y.size(1) % 3 == 0\n    nr_mix = y.size(1) // 3\n\n    # B x T x C\n    y = y.transpose(1, 2)\n    logit_probs = y[:, :, :nr_mix]\n\n    # sample mixture indicator from softmax\n    temp = logit_probs.data.new(logit_probs.size()).uniform_(1e-5, 1.0 - 1e-5)\n    temp = logit_probs.data - torch.log(-torch.log(temp))\n    _, argmax = temp.max(dim=-1)\n\n    # (B, T) -> (B, T, nr_mix)\n    one_hot = to_one_hot(argmax, nr_mix)\n    # select logistic parameters\n    means = torch.sum(y[:, :, nr_mix : 2 * nr_mix] * one_hot, dim=-1)\n    log_scales = torch.clamp(torch.sum(y[:, :, 2 * nr_mix : 3 * nr_mix] * one_hot, dim=-1), min=log_scale_min)\n    # sample from logistic & clip to interval\n    # we don't actually round to the nearest 8bit value when sampling\n    u = means.data.new(means.size()).uniform_(1e-5, 1.0 - 1e-5)\n    x = means + torch.exp(log_scales) * (torch.log(u) - torch.log(1.0 - u))\n\n    x = 
torch.clamp(torch.clamp(x, min=-1.0), max=1.0)\n\n    return x\n\n\ndef to_one_hot(tensor, n, fill_with=1.0):\n    # we perform one-hot encoding with respect to the last axis\n    one_hot = torch.FloatTensor(tensor.size() + (n,)).zero_().type_as(tensor)\n    one_hot.scatter_(len(tensor.size()), tensor.unsqueeze(-1), fill_with)\n    return one_hot\n"
  },
  {
    "path": "TTS/vocoder/utils/generic_utils.py",
    "content": "from typing import Dict\n\nimport numpy as np\nimport torch\nfrom matplotlib import pyplot as plt\n\nfrom TTS.tts.utils.visual import plot_spectrogram\nfrom TTS.utils.audio import AudioProcessor\n\n\ndef interpolate_vocoder_input(scale_factor, spec):\n    \"\"\"Interpolate spectrogram by the scale factor.\n    It is mainly used to match the sampling rates of\n    the tts and vocoder models.\n\n    Args:\n        scale_factor (float): scale factor to interpolate the spectrogram\n        spec (np.array): spectrogram to be interpolated\n\n    Returns:\n        torch.tensor: interpolated spectrogram.\n    \"\"\"\n    print(\" > before interpolation :\", spec.shape)\n    spec = torch.tensor(spec).unsqueeze(0).unsqueeze(0)  # pylint: disable=not-callable\n    spec = torch.nn.functional.interpolate(\n        spec, scale_factor=scale_factor, recompute_scale_factor=True, mode=\"bilinear\", align_corners=False\n    ).squeeze(0)\n    print(\" > after interpolation :\", spec.shape)\n    return spec\n\n\ndef plot_results(y_hat: torch.tensor, y: torch.tensor, ap: AudioProcessor, name_prefix: str = None) -> Dict:\n    \"\"\"Plot the predicted and the real waveform and their spectrograms.\n\n    Args:\n        y_hat (torch.tensor): Predicted waveform.\n        y (torch.tensor): Real waveform.\n        ap (AudioProcessor): Audio processor used to process the waveform.\n        name_prefix (str, optional): Name prefix used to name the figures. 
Defaults to None.\n\n    Returns:\n        Dict: output figures keyed by the name of the figures.\n    \"\"\"\n    if name_prefix is None:\n        name_prefix = \"\"\n\n    # select an instance from batch\n    y_hat = y_hat[0].squeeze().detach().cpu().numpy()\n    y = y[0].squeeze().detach().cpu().numpy()\n\n    spec_fake = ap.melspectrogram(y_hat).T\n    spec_real = ap.melspectrogram(y).T\n    spec_diff = np.abs(spec_fake - spec_real)\n\n    # plot the real and generated waveforms\n    fig_wave = plt.figure()\n    plt.subplot(2, 1, 1)\n    plt.plot(y)\n    plt.title(\"groundtruth speech\")\n    plt.subplot(2, 1, 2)\n    plt.plot(y_hat)\n    plt.title(\"generated speech\")\n    plt.tight_layout()\n    plt.close()\n\n    figures = {\n        name_prefix + \"spectrogram/fake\": plot_spectrogram(spec_fake),\n        name_prefix + \"spectrogram/real\": plot_spectrogram(spec_real),\n        name_prefix + \"spectrogram/diff\": plot_spectrogram(spec_diff),\n        name_prefix + \"speech_comparison\": fig_wave,\n    }\n    return figures\n"
  },
  {
    "path": "TTS_additional_material/.gitignore",
    "content": "WadaSNR/\n.idea/\n*.pyc\n.DS_Store\n./__init__.py\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n.hypothesis/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\n.static_storage/\n.media/\nlocal_settings.py\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# pyenv\n.python-version\n\n# celery beat schedule file\ncelerybeat-schedule\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n\n# vim\n*.swp\n*.swm\n*.swn\n*.swo\n\n# pytorch models\n*.pth\n*.pth.tar\n!dummy_speakers.pth\nresult/\n\n# setup.py\nversion.py\n\n# jupyter dummy files\ncore\n\n# ignore local datasets\nrecipes/WIP/*\nrecipes/ljspeech/LJSpeech-1.1/*\nrecipes/vctk/VCTK/*\nrecipes/**/*.npy\nrecipes/**/*.json\nVCTK-Corpus-removed-silence/*\n\n# ignore training logs\ntrainer_*_log.txt\n\n# files used internally for dev, test 
etc.\ntests/outputs/*\ntests/train_outputs/*\nTODO.txt\n.vscode/*\ndata/*\nnotebooks/data/*\nTTS/tts/utils/monotonic_align/core.c\n.vscode-upload.json\ntemp_build/*\nevents.out*\nold_configs/*\nmodel_importers/*\nmodel_profiling/*\ndocs/source/TODO/*\n.noseids\n.dccache\nlog.txt\numap.png\n*.out\nSocialMedia.txt\noutput.wav\ntts_output.wav\ndeps.json\nspeakers.json\ninternal/*\n*_pitch.npy\n*_phoneme.npy\nwandb\ndepot/*\ncoqui_recipes/*\nlocal_scripts/*\n"
  },
  {
    "path": "TTS_additional_material/.pre-commit-config.yaml",
    "content": "repos:\n  - repo: 'https://github.com/pre-commit/pre-commit-hooks'\n    rev: v2.3.0\n    hooks:\n      - id: check-yaml\n      - id: end-of-file-fixer\n      - id: trailing-whitespace\n  - repo: 'https://github.com/psf/black'\n    rev: 22.3.0\n    hooks:\n      - id: black\n        language_version: python3\n  - repo: https://github.com/pycqa/isort\n    rev: 5.8.0\n    hooks:\n      - id: isort\n        name: isort (python)\n      - id: isort\n        name: isort (cython)\n        types: [cython]\n      - id: isort\n        name: isort (pyi)\n        types: [pyi]\n  - repo: https://github.com/pycqa/pylint\n    rev: v2.8.2\n    hooks:\n    -   id: pylint\n"
  },
  {
    "path": "TTS_additional_material/.pylintrc",
    "content": "[MASTER]\n\n# A comma-separated list of package or module names from where C extensions may\n# be loaded. Extensions are loading into the active Python interpreter and may\n# run arbitrary code.\nextension-pkg-whitelist=\n\n# Add files or directories to the blacklist. They should be base names, not\n# paths.\nignore=CVS\n\n# Add files or directories matching the regex patterns to the blacklist. The\n# regex matches against base names, not paths.\nignore-patterns=\n\n# Python code to execute, usually for sys.path manipulation such as\n# pygtk.require().\n#init-hook=\n\n# Use multiple processes to speed up Pylint. Specifying 0 will auto-detect the\n# number of processors available to use.\njobs=1\n\n# Control the amount of potential inferred values when inferring a single\n# object. This can help the performance when dealing with large functions or\n# complex, nested conditions.\nlimit-inference-results=100\n\n# List of plugins (as comma separated values of python modules names) to load,\n# usually to register additional checkers.\nload-plugins=\n\n# Pickle collected data for later comparisons.\npersistent=yes\n\n# Specify a configuration file.\n#rcfile=\n\n# When enabled, pylint would attempt to guess common misconfiguration and emit\n# user-friendly hints instead of false-positive error messages.\nsuggestion-mode=yes\n\n# Allow loading of arbitrary C extensions. Extensions are imported into the\n# active Python interpreter and may run arbitrary code.\nunsafe-load-any-extension=no\n\n\n[MESSAGES CONTROL]\n\n# Only show warnings with the listed confidence levels. Leave empty to show\n# all. Valid levels: HIGH, INFERENCE, INFERENCE_FAILURE, UNDEFINED.\nconfidence=\n\n# Disable the message, report, category or checker with the given id(s). You\n# can either give multiple identifiers separated by comma (,) or put this\n# option multiple times (only on the command line, not in the configuration\n# file where it should appear only once). 
You can also use \"--disable=all\" to\n# disable everything first and then reenable specific checks. For example, if\n# you want to run only the similarities checker, you can use \"--disable=all\n# --enable=similarities\". If you want to run only the classes checker, but have\n# no Warning level messages displayed, use \"--disable=all --enable=classes\n# --disable=W\".\ndisable=missing-docstring,\n        too-many-public-methods,\n        too-many-lines,\n        bare-except,\n        ## for avoiding weird p3.6 CI linter error\n        ## TODO: see later if we can remove this\n        assigning-non-slot,\n        unsupported-assignment-operation,\n        ## end\n        line-too-long,\n        fixme,\n        wrong-import-order,\n        ungrouped-imports,\n        wrong-import-position,\n        import-error,\n        invalid-name,\n        too-many-instance-attributes,\n        arguments-differ,\n        arguments-renamed,\n        no-name-in-module,\n        no-member,\n        unsubscriptable-object,\n        print-statement,\n        parameter-unpacking,\n        unpacking-in-except,\n        old-raise-syntax,\n        backtick,\n        long-suffix,\n        old-ne-operator,\n        old-octal-literal,\n        import-star-module-level,\n        non-ascii-bytes-literal,\n        raw-checker-failed,\n        bad-inline-option,\n        locally-disabled,\n        file-ignored,\n        suppressed-message,\n        useless-suppression,\n        deprecated-pragma,\n        use-symbolic-message-instead,\n        useless-object-inheritance,\n        too-few-public-methods,\n        too-many-branches,\n        too-many-arguments,\n        too-many-locals,\n        too-many-statements,\n        apply-builtin,\n        basestring-builtin,\n        buffer-builtin,\n        cmp-builtin,\n        coerce-builtin,\n        execfile-builtin,\n        file-builtin,\n        long-builtin,\n        raw_input-builtin,\n        reduce-builtin,\n        standarderror-builtin,\n  
      unicode-builtin,\n        xrange-builtin,\n        coerce-method,\n        delslice-method,\n        getslice-method,\n        setslice-method,\n        no-absolute-import,\n        old-division,\n        dict-iter-method,\n        dict-view-method,\n        next-method-called,\n        metaclass-assignment,\n        indexing-exception,\n        raising-string,\n        reload-builtin,\n        oct-method,\n        hex-method,\n        nonzero-method,\n        cmp-method,\n        input-builtin,\n        round-builtin,\n        intern-builtin,\n        unichr-builtin,\n        map-builtin-not-iterating,\n        zip-builtin-not-iterating,\n        range-builtin-not-iterating,\n        filter-builtin-not-iterating,\n        using-cmp-argument,\n        eq-without-hash,\n        div-method,\n        idiv-method,\n        rdiv-method,\n        exception-message-attribute,\n        invalid-str-codec,\n        sys-max-int,\n        bad-python3-import,\n        deprecated-string-function,\n        deprecated-str-translate-call,\n        deprecated-itertools-function,\n        deprecated-types-field,\n        next-method-defined,\n        dict-items-not-iterating,\n        dict-keys-not-iterating,\n        dict-values-not-iterating,\n        deprecated-operator-function,\n        deprecated-urllib-function,\n        xreadlines-attribute,\n        deprecated-sys-function,\n        exception-escape,\n        comprehension-escape,\n        duplicate-code,\n        not-callable,\n        import-outside-toplevel\n\n# Enable the message, report, category or checker with the given id(s). You can\n# either give multiple identifier separated by comma (,) or put this option\n# multiple time (only on the command line, not in the configuration file where\n# it should appear only once). See also the \"--disable\" option for examples.\nenable=c-extension-no-member\n\n\n[REPORTS]\n\n# Python expression which should return a note less than 10 (10 is the highest\n# note). 
You have access to the variables errors warning, statement which\n# respectively contain the number of errors / warnings messages and the total\n# number of statements analyzed. This is used by the global evaluation report\n# (RP0004).\nevaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)\n\n# Template used to display messages. This is a python new-style format string\n# used to format the message information. See doc for all details.\n#msg-template=\n\n# Set the output format. Available formats are text, parseable, colorized, json\n# and msvs (visual studio). You can also give a reporter class, e.g.\n# mypackage.mymodule.MyReporterClass.\noutput-format=text\n\n# Tells whether to display a full report or only the messages.\nreports=no\n\n# Activate the evaluation score.\nscore=yes\n\n\n[REFACTORING]\n\n# Maximum number of nested blocks for function / method body\nmax-nested-blocks=5\n\n# Complete name of functions that never returns. When checking for\n# inconsistent-return-statements if a never returning function is called then\n# it will be considered as an explicit return statement and no message will be\n# printed.\nnever-returning-functions=sys.exit\n\n\n[LOGGING]\n\n# Format style used to check logging format string. `old` means using %\n# formatting, while `new` is for `{}` formatting.\nlogging-format-style=old\n\n# Logging modules to check that the string format arguments are in logging\n# function parameter format.\nlogging-modules=logging\n\n\n[SPELLING]\n\n# Limits count of emitted suggestions for spelling mistakes.\nmax-spelling-suggestions=4\n\n# Spelling dictionary name. Available dictionaries: none. 
To make it working\n# install python-enchant package..\nspelling-dict=\n\n# List of comma separated words that should not be checked.\nspelling-ignore-words=\n\n# A path to a file that contains private dictionary; one word per line.\nspelling-private-dict-file=\n\n# Tells whether to store unknown words to indicated private dictionary in\n# --spelling-private-dict-file option instead of raising a message.\nspelling-store-unknown-words=no\n\n\n[MISCELLANEOUS]\n\n# List of note tags to take in consideration, separated by a comma.\nnotes=FIXME,\n      XXX,\n      TODO\n\n\n[TYPECHECK]\n\n# List of decorators that produce context managers, such as\n# contextlib.contextmanager. Add to this list to register other decorators that\n# produce valid context managers.\ncontextmanager-decorators=contextlib.contextmanager\n\n# List of members which are set dynamically and missed by pylint inference\n# system, and so shouldn't trigger E1101 when accessed. Python regular\n# expressions are accepted.\ngenerated-members=numpy.*,torch.*\n\n# Tells whether missing members accessed in mixin class should be ignored. A\n# mixin class is detected if its name ends with \"mixin\" (case insensitive).\nignore-mixin-members=yes\n\n# Tells whether to warn about missing members when the owner of the attribute\n# is inferred to be None.\nignore-none=yes\n\n# This flag controls whether pylint should warn about no-member and similar\n# checks whenever an opaque object is returned when inferring. The inference\n# can return multiple potential results while evaluating a Python object, but\n# some branches might not be evaluated, which results in partial inference. In\n# that case, it might be useful to still emit no-member and other checks for\n# the rest of the inferred objects.\nignore-on-opaque-inference=yes\n\n# List of class names for which member attributes should not be checked (useful\n# for classes with dynamically set attributes). 
This supports the use of\n# qualified names.\nignored-classes=optparse.Values,thread._local,_thread._local\n\n# List of module names for which member attributes should not be checked\n# (useful for modules/projects where namespaces are manipulated during runtime\n# and thus existing member attributes cannot be deduced by static analysis. It\n# supports qualified module names, as well as Unix pattern matching.\nignored-modules=\n\n# Show a hint with possible names when a member name was not found. The aspect\n# of finding the hint is based on edit distance.\nmissing-member-hint=yes\n\n# The minimum edit distance a name should have in order to be considered a\n# similar match for a missing member name.\nmissing-member-hint-distance=1\n\n# The total number of similar names that should be taken in consideration when\n# showing a hint for a missing member.\nmissing-member-max-choices=1\n\n\n[VARIABLES]\n\n# List of additional names supposed to be defined in builtins. Remember that\n# you should avoid defining new builtins when possible.\nadditional-builtins=\n\n# Tells whether unused global variables should be treated as a violation.\nallow-global-unused-variables=yes\n\n# List of strings which can identify a callback function by name. A callback\n# name must start or end with one of those strings.\ncallbacks=cb_,\n          _cb\n\n# A regular expression matching the name of dummy variables (i.e. expected to\n# not be used).\ndummy-variables-rgx=_+$|(_[a-zA-Z0-9_]*[a-zA-Z0-9]+?$)|dummy|^ignored_|^unused_\n\n# Argument names that match this expression will be ignored. Default to name\n# with leading underscore.\nignored-argument-names=_.*|^ignored_|^unused_\n\n# Tells whether we should check for unused import in __init__ files.\ninit-import=no\n\n# List of qualified module names which can have objects that can redefine\n# builtins.\nredefining-builtins-modules=six.moves,past.builtins,future.builtins,builtins,io\n\n\n[FORMAT]\n\n# Expected format of line ending, e.g. 
empty (any line ending), LF or CRLF.\nexpected-line-ending-format=\n\n# Regexp for a line that is allowed to be longer than the limit.\nignore-long-lines=^\\s*(# )?<?https?://\\S+>?$\n\n# Number of spaces of indent required inside a hanging or continued line.\nindent-after-paren=4\n\n# String used as indentation unit. This is usually \"    \" (4 spaces) or \"\\t\" (1\n# tab).\nindent-string='    '\n\n# Maximum number of characters on a single line.\nmax-line-length=120\n\n# Maximum number of lines in a module.\nmax-module-lines=1000\n\n# List of optional constructs for which whitespace checking is disabled. `dict-\n# separator` is used to allow tabulation in dicts, etc.: {1  : 1,\\n222: 2}.\n# `trailing-comma` allows a space between comma and closing bracket: (a, ).\n# `empty-line` allows space-only lines.\nno-space-check=trailing-comma,\n               dict-separator\n\n# Allow the body of a class to be on the same line as the declaration if body\n# contains single statement.\nsingle-line-class-stmt=no\n\n# Allow the body of an if to be on the same line as the test if there is no\n# else.\nsingle-line-if-stmt=no\n\n\n[SIMILARITIES]\n\n# Ignore comments when computing similarities.\nignore-comments=yes\n\n# Ignore docstrings when computing similarities.\nignore-docstrings=yes\n\n# Ignore imports when computing similarities.\nignore-imports=no\n\n# Minimum lines number of a similarity.\nmin-similarity-lines=4\n\n\n[BASIC]\n\n# Naming style matching correct argument names.\nargument-naming-style=snake_case\n\n# Regular expression matching correct argument names. Overrides argument-\n# naming-style.\nargument-rgx=[a-z_][a-z0-9_]{0,30}$\n\n# Naming style matching correct attribute names.\nattr-naming-style=snake_case\n\n# Regular expression matching correct attribute names. 
Overrides attr-naming-\n# style.\n#attr-rgx=\n\n# Bad variable names which should always be refused, separated by a comma.\nbad-names=\n\n# Naming style matching correct class attribute names.\nclass-attribute-naming-style=any\n\n# Regular expression matching correct class attribute names. Overrides class-\n# attribute-naming-style.\n#class-attribute-rgx=\n\n# Naming style matching correct class names.\nclass-naming-style=PascalCase\n\n# Regular expression matching correct class names. Overrides class-naming-\n# style.\n#class-rgx=\n\n# Naming style matching correct constant names.\nconst-naming-style=UPPER_CASE\n\n# Regular expression matching correct constant names. Overrides const-naming-\n# style.\n#const-rgx=\n\n# Minimum line length for functions/classes that require docstrings, shorter\n# ones are exempt.\ndocstring-min-length=-1\n\n# Naming style matching correct function names.\nfunction-naming-style=snake_case\n\n# Regular expression matching correct function names. Overrides function-\n# naming-style.\n#function-rgx=\n\n# Good variable names which should always be accepted, separated by a comma.\ngood-names=i,\n           j,\n           k,\n           x,\n           ex,\n           Run,\n           _\n\n# Include a hint for the correct naming format with invalid-name.\ninclude-naming-hint=no\n\n# Naming style matching correct inline iteration names.\ninlinevar-naming-style=any\n\n# Regular expression matching correct inline iteration names. Overrides\n# inlinevar-naming-style.\n#inlinevar-rgx=\n\n# Naming style matching correct method names.\nmethod-naming-style=snake_case\n\n# Regular expression matching correct method names. Overrides method-naming-\n# style.\n#method-rgx=\n\n# Naming style matching correct module names.\nmodule-naming-style=snake_case\n\n# Regular expression matching correct module names. 
Overrides module-naming-\n# style.\n#module-rgx=\n\n# Colon-delimited sets of names that determine each other's naming style when\n# the name regexes allow several styles.\nname-group=\n\n# Regular expression which should only match function or class names that do\n# not require a docstring.\nno-docstring-rgx=^_\n\n# List of decorators that produce properties, such as abc.abstractproperty. Add\n# to this list to register other decorators that produce valid properties.\n# These decorators are taken in consideration only for invalid-name.\nproperty-classes=abc.abstractproperty\n\n# Naming style matching correct variable names.\nvariable-naming-style=snake_case\n\n# Regular expression matching correct variable names. Overrides variable-\n# naming-style.\nvariable-rgx=[a-z_][a-z0-9_]{0,30}$\n\n\n[STRING]\n\n# This flag controls whether the implicit-str-concat-in-sequence should\n# generate a warning on implicit string concatenation in sequences defined over\n# several lines.\ncheck-str-concat-over-line-jumps=no\n\n\n[IMPORTS]\n\n# Allow wildcard imports from modules that define __all__.\nallow-wildcard-with-all=no\n\n# Analyse import fallback blocks. This can be used to support both Python 2 and\n# 3 compatible code, which means that the block might have code that exists\n# only in one or another interpreter, leading to false positives when analysed.\nanalyse-fallback-blocks=no\n\n# Deprecated modules which should not be used, separated by a comma.\ndeprecated-modules=optparse,tkinter.tix\n\n# Create a graph of external dependencies in the given file (report RP0402 must\n# not be disabled).\next-import-graph=\n\n# Create a graph of every (i.e. 
internal and external) dependencies in the\n# given file (report RP0402 must not be disabled).\nimport-graph=\n\n# Create a graph of internal dependencies in the given file (report RP0402 must\n# not be disabled).\nint-import-graph=\n\n# Force import order to recognize a module as part of the standard\n# compatibility libraries.\nknown-standard-library=\n\n# Force import order to recognize a module as part of a third party library.\nknown-third-party=enchant\n\n\n[CLASSES]\n\n# List of method names used to declare (i.e. assign) instance attributes.\ndefining-attr-methods=__init__,\n                      __new__,\n                      setUp\n\n# List of member names, which should be excluded from the protected access\n# warning.\nexclude-protected=_asdict,\n                  _fields,\n                  _replace,\n                  _source,\n                  _make\n\n# List of valid names for the first argument in a class method.\nvalid-classmethod-first-arg=cls\n\n# List of valid names for the first argument in a metaclass class method.\nvalid-metaclass-classmethod-first-arg=cls\n\n\n[DESIGN]\n\n# Maximum number of arguments for function / method.\nmax-args=5\n\n# Maximum number of attributes for a class (see R0902).\nmax-attributes=7\n\n# Maximum number of boolean expressions in an if statement.\nmax-bool-expr=5\n\n# Maximum number of branch for function / method body.\nmax-branches=12\n\n# Maximum number of locals for function / method body.\nmax-locals=15\n\n# Maximum number of parents for a class (see R0901).\nmax-parents=15\n\n# Maximum number of public methods for a class (see R0904).\nmax-public-methods=20\n\n# Maximum number of return / yield for function / method body.\nmax-returns=6\n\n# Maximum number of statements in function / method body.\nmax-statements=50\n\n# Minimum number of public methods for a class (see R0903).\nmin-public-methods=2\n\n\n[EXCEPTIONS]\n\n# Exceptions that will emit a warning when being caught. 
Defaults to\n# \"BaseException, Exception\".\novergeneral-exceptions=BaseException,\n                       Exception\n"
  },
  {
    "path": "TTS_additional_material/.readthedocs.yml",
    "content": "# .readthedocs.yml\n# Read the Docs configuration file\n# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details\n\n# Required\nversion: 2\n\n# Build documentation in the docs/ directory with Sphinx\nsphinx:\n  builder: html\n  configuration: docs/source/conf.py\n\n# Optionally set the version of Python and requirements required to build your docs\npython:\n  version: 3.7\n  install:\n    - requirements: docs/requirements.txt\n    - requirements: requirements.txt"
  },
  {
    "path": "TTS_additional_material/CODE_OF_CONDUCT.md",
    "content": "\n# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nWe as members, contributors, and leaders pledge to make participation in our\ncommunity a harassment-free experience for everyone, regardless of age, body\nsize, visible or invisible disability, ethnicity, sex characteristics, gender\nidentity and expression, level of experience, education, socio-economic status,\nnationality, personal appearance, race, caste, color, religion, or sexual identity\nand orientation.\n\nWe pledge to act and interact in ways that contribute to an open, welcoming,\ndiverse, inclusive, and healthy community.\n\n## Our Standards\n\nExamples of behavior that contributes to a positive environment for our\ncommunity include:\n\n* Demonstrating empathy and kindness toward other people\n* Being respectful of differing opinions, viewpoints, and experiences\n* Giving and gracefully accepting constructive feedback\n* Accepting responsibility and apologizing to those affected by our mistakes,\n  and learning from the experience\n* Focusing on what is best not just for us as individuals, but for the\n  overall community\n\nExamples of unacceptable behavior include:\n\n* The use of sexualized language or imagery, and sexual attention or\n  advances of any kind\n* Trolling, insulting or derogatory comments, and personal or political attacks\n* Public or private harassment\n* Publishing others' private information, such as a physical or email\n  address, without their explicit permission\n* Other conduct which could reasonably be considered inappropriate in a\n  professional setting\n\n## Enforcement Responsibilities\n\nCommunity leaders are responsible for clarifying and enforcing our standards of\nacceptable behavior and will take appropriate and fair corrective action in\nresponse to any behavior that they deem inappropriate, threatening, offensive,\nor harmful.\n\nCommunity leaders have the right and responsibility to remove, edit, or reject\ncomments, commits, code, wiki 
edits, issues, and other contributions that are\nnot aligned to this Code of Conduct, and will communicate reasons for moderation\ndecisions when appropriate.\n\n## Scope\n\nThis Code of Conduct applies within all community spaces, and also applies when\nan individual is officially representing the community in public spaces.\nExamples of representing our community include using an official e-mail address,\nposting via an official social media account, or acting as an appointed\nrepresentative at an online or offline event.\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be\nreported to the community leaders responsible for enforcement at\ncoc-report@coqui.ai.\nAll complaints will be reviewed and investigated promptly and fairly.\n\nAll community leaders are obligated to respect the privacy and security of the\nreporter of any incident.\n\n## Enforcement Guidelines\n\nCommunity leaders will follow these Community Impact Guidelines in determining\nthe consequences for any action they deem in violation of this Code of Conduct:\n\n### 1. Correction\n\n**Community Impact**: Use of inappropriate language or other behavior deemed\nunprofessional or unwelcome in the community.\n\n**Consequence**: A private, written warning from community leaders, providing\nclarity around the nature of the violation and an explanation of why the\nbehavior was inappropriate. A public apology may be requested.\n\n### 2. Warning\n\n**Community Impact**: A violation through a single incident or series\nof actions.\n\n**Consequence**: A warning with consequences for continued behavior. No\ninteraction with the people involved, including unsolicited interaction with\nthose enforcing the Code of Conduct, for a specified period of time. This\nincludes avoiding interactions in community spaces as well as external channels\nlike social media. Violating these terms may lead to a temporary or\npermanent ban.\n\n### 3. 
Temporary Ban\n\n**Community Impact**: A serious violation of community standards, including\nsustained inappropriate behavior.\n\n**Consequence**: A temporary ban from any sort of interaction or public\ncommunication with the community for a specified period of time. No public or\nprivate interaction with the people involved, including unsolicited interaction\nwith those enforcing the Code of Conduct, is allowed during this period.\nViolating these terms may lead to a permanent ban.\n\n### 4. Permanent Ban\n\n**Community Impact**: Demonstrating a pattern of violation of community\nstandards, including sustained inappropriate behavior,  harassment of an\nindividual, or aggression toward or disparagement of classes of individuals.\n\n**Consequence**: A permanent ban from any sort of public interaction within\nthe community.\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant][homepage],\nversion 2.0, available at\n[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].\n\nCommunity Impact Guidelines were inspired by \n[Mozilla's code of conduct enforcement ladder][Mozilla CoC].\n\nFor answers to common questions about this code of conduct, see the FAQ at\n[https://www.contributor-covenant.org/faq][FAQ]. Translations are available \nat [https://www.contributor-covenant.org/translations][translations].\n\n[homepage]: https://www.contributor-covenant.org\n[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html\n[Mozilla CoC]: https://github.com/mozilla/diversity\n[FAQ]: https://www.contributor-covenant.org/faq\n[translations]: https://www.contributor-covenant.org/translations\n"
  },
  {
    "path": "TTS_additional_material/README.md",
    "content": "<img src=\"https://raw.githubusercontent.com/coqui-ai/TTS/main/images/coqui-log-green-TTS.png\" height=\"56\"/>\n\n----\n\n### 📣 Clone your voice with a single click on [🐸Coqui.ai](https://app.coqui.ai/auth/signin)\n\n----\n\n🐸TTS is a library for advanced Text-to-Speech generation. It's built on the latest research, was designed to achieve the best trade-off among ease-of-training, speed and quality.\n🐸TTS comes with pretrained models, tools for measuring dataset quality and already used in **20+ languages** for products and research projects.\n\n[![Dicord](https://img.shields.io/discord/1037326658807533628?color=%239B59B6&label=chat%20on%20discord)](https://discord.gg/5eXr5seRrv)\n[![License](<https://img.shields.io/badge/License-MPL%202.0-brightgreen.svg>)](https://opensource.org/licenses/MPL-2.0)\n[![PyPI version](https://badge.fury.io/py/TTS.svg)](https://badge.fury.io/py/TTS)\n[![Covenant](https://camo.githubusercontent.com/7d620efaa3eac1c5b060ece5d6aacfcc8b81a74a04d05cd0398689c01c4463bb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f6e7472696275746f72253230436f76656e616e742d76322e3025323061646f707465642d6666363962342e737667)](https://github.com/coqui-ai/TTS/blob/master/CODE_OF_CONDUCT.md)\n[![Downloads](https://pepy.tech/badge/tts)](https://pepy.tech/project/tts)\n[![DOI](https://zenodo.org/badge/265612440.svg)](https://zenodo.org/badge/latestdoi/265612440)\n\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/aux_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/data_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/docker.yaml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/inference_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/style_check.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/text_tests.yml/badge.svg)\n![GithubActio
ns](https://github.com/coqui-ai/TTS/actions/workflows/tts_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/vocoder_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/zoo_tests0.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/zoo_tests1.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/zoo_tests2.yml/badge.svg)\n[![Docs](<https://readthedocs.org/projects/tts/badge/?version=latest&style=plastic>)](https://tts.readthedocs.io/en/latest/)\n\n📰 [**Subscribe to 🐸Coqui.ai Newsletter**](https://coqui.ai/?subscription=true)\n\n📢 [English Voice Samples](https://erogol.github.io/ddc-samples/) and [SoundCloud playlist](https://soundcloud.com/user-565970875/pocket-article-wavernn-and-tacotron2)\n\n📄 [Text-to-Speech paper collection](https://github.com/erogol/TTS-papers)\n\n<img src=\"https://static.scarf.sh/a.png?x-pxid=cf317fe7-2188-4721-bc01-124bb5d5dbb2\" />\n\n## 💬 Where to ask questions\nPlease use our dedicated channels for questions and discussion. 
Help is much more valuable if it's shared publicly so that more people can benefit from it.\n\n| Type                            | Platforms                               |\n| ------------------------------- | --------------------------------------- |\n| 🚨 **Bug Reports**              | [GitHub Issue Tracker]                  |\n| 🎁 **Feature Requests & Ideas** | [GitHub Issue Tracker]                  |\n| 👩‍💻 **Usage Questions**          | [GitHub Discussions]                    |\n| 🗯 **General Discussion**       | [GitHub Discussions] or [Discord]   |\n\n[github issue tracker]: https://github.com/coqui-ai/tts/issues\n[github discussions]: https://github.com/coqui-ai/TTS/discussions\n[discord]: https://discord.gg/5eXr5seRrv\n[Tutorials and Examples]: https://github.com/coqui-ai/TTS/wiki/TTS-Notebooks-and-Tutorials\n\n\n## 🔗 Links and Resources\n| Type                            | Links                               |\n| ------------------------------- | --------------------------------------- |\n| 💼 **Documentation**              | [ReadTheDocs](https://tts.readthedocs.io/en/latest/)\n| 💾 **Installation**               | [TTS/README.md](https://github.com/coqui-ai/TTS/tree/dev#install-tts)|\n| 👩‍💻 **Contributing**               | [CONTRIBUTING.md](https://github.com/coqui-ai/TTS/blob/main/CONTRIBUTING.md)|\n| 📌 **Road Map**                   | [Main Development Plans](https://github.com/coqui-ai/TTS/issues/378)\n| 🚀 **Released Models**            | [TTS Releases](https://github.com/coqui-ai/TTS/releases) and [Experimental Models](https://github.com/coqui-ai/TTS/wiki/Experimental-Released-Models)|\n\n## 🥇 TTS Performance\n<p align=\"center\"><img src=\"https://raw.githubusercontent.com/coqui-ai/TTS/main/images/TTS-performance.png\" width=\"800\" /></p>\n\nUnderlined \"TTS*\" and \"Judy*\" are 🐸TTS models\n<!-- [Details...](https://github.com/coqui-ai/TTS/wiki/Mean-Opinion-Score-Results) -->\n\n## Features\n- High-performance Deep Learning models for Text2Speech 
tasks.\n    - Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech).\n    - Speaker Encoder to compute speaker embeddings efficiently.\n    - Vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN)\n- Fast and efficient model training.\n- Detailed training logs on the terminal and Tensorboard.\n- Support for Multi-speaker TTS.\n- Efficient, flexible, lightweight but feature complete `Trainer API`.\n- Released and ready-to-use models.\n- Tools to curate Text2Speech datasets under```dataset_analysis```.\n- Utilities to use and test your models.\n- Modular (but not too much) code base enabling easy implementation of new ideas.\n\n## Implemented Models\n### Spectrogram models\n- Tacotron: [paper](https://arxiv.org/abs/1703.10135)\n- Tacotron2: [paper](https://arxiv.org/abs/1712.05884)\n- Glow-TTS: [paper](https://arxiv.org/abs/2005.11129)\n- Speedy-Speech: [paper](https://arxiv.org/abs/2008.03802)\n- Align-TTS: [paper](https://arxiv.org/abs/2003.01950)\n- FastPitch: [paper](https://arxiv.org/pdf/2006.06873.pdf)\n- FastSpeech: [paper](https://arxiv.org/abs/1905.09263)\n- FastSpeech2: [paper](https://arxiv.org/abs/2006.04558)\n- SC-GlowTTS: [paper](https://arxiv.org/abs/2104.05557)\n- Capacitron: [paper](https://arxiv.org/abs/1906.03402)\n- OverFlow: [paper](https://arxiv.org/abs/2211.06892)\n- Neural HMM TTS: [paper](https://arxiv.org/abs/2108.13320)\n\n### End-to-End Models\n- VITS: [paper](https://arxiv.org/pdf/2106.06103)\n- YourTTS: [paper](https://arxiv.org/abs/2112.02418)\n\n### Attention Methods\n- Guided Attention: [paper](https://arxiv.org/abs/1710.08969)\n- Forward Backward Decoding: [paper](https://arxiv.org/abs/1907.09006)\n- Graves Attention: [paper](https://arxiv.org/abs/1910.10288)\n- Double Decoder Consistency: [blog](https://erogol.com/solving-attention-problems-of-tts-models-with-double-decoder-consistency/)\n- Dynamic Convolutional Attention: [paper](https://arxiv.org/pdf/1910.10288.pdf)\n- Alignment Network: 
[paper](https://arxiv.org/abs/2108.10447)\n\n### Speaker Encoder\n- GE2E: [paper](https://arxiv.org/abs/1710.10467)\n- Angular Loss: [paper](https://arxiv.org/pdf/2003.11982.pdf)\n\n### Vocoders\n- MelGAN: [paper](https://arxiv.org/abs/1910.06711)\n- MultiBandMelGAN: [paper](https://arxiv.org/abs/2005.05106)\n- ParallelWaveGAN: [paper](https://arxiv.org/abs/1910.11480)\n- GAN-TTS discriminators: [paper](https://arxiv.org/abs/1909.11646)\n- WaveRNN: [origin](https://github.com/fatchord/WaveRNN/)\n- WaveGrad: [paper](https://arxiv.org/abs/2009.00713)\n- HiFiGAN: [paper](https://arxiv.org/abs/2010.05646)\n- UnivNet: [paper](https://arxiv.org/abs/2106.07889)\n\nYou can also help us implement more models.\n\n## Install TTS\n🐸TTS is tested on Ubuntu 18.04 with **python >= 3.7, < 3.11**.\n\nIf you are only interested in [synthesizing speech](https://tts.readthedocs.io/en/latest/inference.html) with the released 🐸TTS models, installing from PyPI is the easiest option.\n\n```bash\npip install TTS\n```\n\nIf you plan to code or train models, clone 🐸TTS and install it locally.\n\n```bash\ngit clone https://github.com/coqui-ai/TTS\ncd TTS\npip install -e .[all,dev,notebooks]  # Select the relevant extras\n```\n\nIf you are on Ubuntu (Debian), you can also run the following commands for installation.\n\n```bash\n$ make system-deps  # intended to be used on Ubuntu (Debian). 
Let us know if you have a different OS.\n$ make install\n```\n\nIf you are on Windows, 👑@GuyPaddock wrote installation instructions [here](https://stackoverflow.com/questions/66726331/how-can-i-run-mozilla-tts-coqui-tts-training-with-cuda-on-a-windows-system).\n\n\n## Docker Image\nYou can also try TTS without installing it by using the Docker image.\nThe first command below starts a container; the other two are run inside it.\n\n```bash\ndocker run --rm -it -p 5002:5002 --entrypoint /bin/bash ghcr.io/coqui-ai/tts-cpu\npython3 TTS/server/server.py --list_models # To get the list of available models\npython3 TTS/server/server.py --model_name tts_models/en/vctk/vits # To start a server\n```\n\nYou can then enjoy the TTS server [here](http://[::1]:5002/).\nMore details about the Docker images (like GPU support) can be found [here](https://tts.readthedocs.io/en/latest/docker_images.html).\n\n\n## Synthesizing speech by 🐸TTS\n\n### 🐍 Python API\n\n```python\nfrom TTS.api import TTS\n\n# Running a multi-speaker and multi-lingual model\n\n# List available 🐸TTS models and choose the first one\nmodel_name = TTS.list_models()[0]\n# Init TTS\ntts = TTS(model_name)\n# Run TTS\n# ❗ Since this model is multi-speaker and multi-lingual, we must set the target speaker and the language\n# Text to speech with a numpy output\nwav = tts.tts(\"This is a test! 
This is also a test!!\", speaker=tts.speakers[0], language=tts.languages[0])\n# Text to speech to a file\ntts.tts_to_file(text=\"Hello world!\", speaker=tts.speakers[0], language=tts.languages[0], file_path=\"output.wav\")\n\n# Running a single speaker model\n\n# Init TTS with the target model name\ntts = TTS(model_name=\"tts_models/de/thorsten/tacotron2-DDC\", progress_bar=False, gpu=False)\n# Run TTS\ntts.tts_to_file(text=\"Ich bin eine Testnachricht.\", file_path=\"output.wav\")\n\n# Example voice cloning with YourTTS in English, French and Portuguese:\ntts = TTS(model_name=\"tts_models/multilingual/multi-dataset/your_tts\", progress_bar=False, gpu=True)\ntts.tts_to_file(\"This is voice cloning.\", speaker_wav=\"my/cloning/audio.wav\", language=\"en\", file_path=\"output.wav\")\ntts.tts_to_file(\"C'est le clonage de la voix.\", speaker_wav=\"my/cloning/audio.wav\", language=\"fr\", file_path=\"output.wav\")\ntts.tts_to_file(\"Isso é clonagem de voz.\", speaker_wav=\"my/cloning/audio.wav\", language=\"pt\", file_path=\"output.wav\")\n```\n\n### Command line `tts`\n#### Single Speaker Models\n\n- List provided models:\n\n    ```\n    $ tts --list_models\n    ```\n- Get model info (for both tts_models and vocoder_models):\n    - Query by type/name:\n        `--model_info_by_name` takes the model name as it appears in the `--list_models` output.\n        ```\n        $ tts --model_info_by_name \"<model_type>/<language>/<dataset>/<model_name>\"\n        ```\n        For example:\n\n        ```\n        $ tts --model_info_by_name tts_models/tr/common-voice/glow-tts\n        ```\n        ```\n        $ tts --model_info_by_name vocoder_models/en/ljspeech/hifigan_v2\n        ```\n    - Query by type/idx:\n        `model_query_idx` is the corresponding index from the `--list_models` output.\n        ```\n        $ tts --model_info_by_idx \"<model_type>/<model_query_idx>\"\n        ```\n        For example:\n\n        ```\n        $ tts --model_info_by_idx tts_models/3\n        ```\n\n- Run TTS with default 
models:\n\n    ```\n    $ tts --text \"Text for TTS\" --out_path output/path/speech.wav\n    ```\n\n- Run a TTS model with its default vocoder model:\n\n    ```\n    $ tts --text \"Text for TTS\" --model_name \"<model_type>/<language>/<dataset>/<model_name>\" --out_path output/path/speech.wav\n    ```\n  For example:\n\n    ```\n    $ tts --text \"Text for TTS\" --model_name \"tts_models/en/ljspeech/glow-tts\" --out_path output/path/speech.wav\n    ```\n\n- Run with specific TTS and vocoder models from the list:\n\n    ```\n    $ tts --text \"Text for TTS\" --model_name \"<model_type>/<language>/<dataset>/<model_name>\" --vocoder_name \"<model_type>/<language>/<dataset>/<model_name>\" --out_path output/path/speech.wav\n    ```\n\n  For example:\n\n    ```\n    $ tts --text \"Text for TTS\" --model_name \"tts_models/en/ljspeech/glow-tts\" --vocoder_name \"vocoder_models/en/ljspeech/univnet\" --out_path output/path/speech.wav\n    ```\n\n\n- Run your own TTS model (Using Griffin-Lim Vocoder):\n\n    ```\n    $ tts --text \"Text for TTS\" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav\n    ```\n\n- Run your own TTS and Vocoder models:\n    ```\n    $ tts --text \"Text for TTS\" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav \\\n        --vocoder_path path/to/vocoder.pth --vocoder_config_path path/to/vocoder_config.json\n    ```\n\n#### Multi-speaker Models\n\n- List the available speakers and choose a <speaker_id> among them:\n\n    ```\n    $ tts --model_name \"<language>/<dataset>/<model_name>\"  --list_speaker_idxs\n    ```\n\n- Run the multi-speaker TTS model with the target speaker ID:\n\n    ```\n    $ tts --text \"Text for TTS.\" --out_path output/path/speech.wav --model_name \"<language>/<dataset>/<model_name>\"  --speaker_idx <speaker_id>\n    ```\n\n- Run your own multi-speaker TTS model:\n\n    ```\n    $ tts --text \"Text for TTS\" --out_path 
output/path/speech.wav --model_path path/to/model.pth --config_path path/to/config.json --speakers_file_path path/to/speaker.json --speaker_idx <speaker_id>\n    ```\n\n## Directory Structure\n```\n|- notebooks/       (Jupyter Notebooks for model evaluation, parameter selection and data analysis.)\n|- utils/           (common utilities.)\n|- TTS\n    |- bin/             (folder for all the executables.)\n      |- train*.py                  (train your target model.)\n      |- ...\n    |- tts/             (text to speech models)\n        |- layers/          (model layer definitions)\n        |- models/          (model definitions)\n        |- utils/           (model specific utilities.)\n    |- speaker_encoder/ (Speaker Encoder models.)\n        |- (same)\n    |- vocoder/         (Vocoder models.)\n        |- (same)\n```\n"
  },
  {
    "path": "TTS_additional_material/hubconf.py",
    "content": "dependencies = [\n    'torch', 'gdown', 'pysbd', 'gruut', 'anyascii', 'pypinyin', 'coqpit', 'mecab-python3', 'unidic-lite'\n]\nimport torch\n\nfrom TTS.utils.manage import ModelManager\nfrom TTS.utils.synthesizer import Synthesizer\n\n\ndef tts(model_name='tts_models/en/ljspeech/tacotron2-DCA',\n        vocoder_name=None,\n        use_cuda=False):\n    \"\"\"TTS entry point for PyTorch Hub that provides a Synthesizer object to synthesize speech from a give text.\n\n    Example:\n        >>> synthesizer = torch.hub.load('coqui-ai/TTS', 'tts', source='github')\n        >>> wavs = synthesizer.tts(\"This is a test! This is also a test!!\")\n            wavs - is a list of values of the synthesized speech.\n\n    Args:\n        model_name (str, optional): One of the model names from .model.json. Defaults to 'tts_models/en/ljspeech/tacotron2-DCA'.\n        vocoder_name (str, optional): One of the model names from .model.json. Defaults to 'vocoder_models/en/ljspeech/multiband-melgan'.\n        pretrained (bool, optional): [description]. Defaults to True.\n\n    Returns:\n        TTS.utils.synthesizer.Synthesizer: Synthesizer object wrapping both vocoder and tts models.\n    \"\"\"\n    manager = ModelManager()\n\n    model_path, config_path, model_item = manager.download_model(model_name)\n    vocoder_name = model_item[\n        'default_vocoder'] if vocoder_name is None else vocoder_name\n    vocoder_path, vocoder_config_path, _ = manager.download_model(vocoder_name)\n\n    # create synthesizer\n    synt = Synthesizer(tts_checkpoint=model_path,\n                       tts_config_path=config_path,\n                       vocoder_checkpoint=vocoder_path,\n                       vocoder_config=vocoder_config_path,\n                       use_cuda=use_cuda)\n    return synt\n\n\nif __name__ == '__main__':\n    synthesizer = torch.hub.load('coqui-ai/TTS:dev', 'tts', source='github')\n    synthesizer.tts(\"This is a test!\")\n"
  },
  {
    "path": "TTS_additional_material/requirements.txt",
    "content": "# core deps\nnumpy==1.21.6;python_version<\"3.10\"\nnumpy;python_version==\"3.10\"\ncython==0.29.28\nscipy>=1.4.0\ntorch>=1.7\ntorchaudio\nsoundfile\nlibrosa==0.8.0\nnumba==0.55.1;python_version<\"3.9\"\nnumba==0.56.4;python_version>=\"3.9\"\ninflect==5.6.0\ntqdm\nanyascii\npyyaml\nfsspec>=2021.04.0\naiohttp\npackaging\n# deps for examples\nflask\n# deps for inference\npysbd\n# deps for notebooks\numap-learn==0.5.1\npandas\n# deps for training\nmatplotlib\n# coqui stack\ntrainer==0.0.20\n# config management\ncoqpit>=0.0.16\n# chinese g2p deps\njieba\npypinyin\n# japanese g2p deps\nmecab-python3==1.0.5\nunidic-lite==1.0.8\n# gruut+supported langs\ngruut[de]==2.2.3\n# deps for korean\njamo\nnltk\ng2pkk>=0.1.1\n"
  },
  {
    "path": "UpdateHistory.md",
    "content": "# Update History\r\n---\r\n## JULY 14th 2023 UPDATE: Research Mode\r\nI can finnaly share the first draft of the Research Mode. This modality was thought for people often dealing with research papers. \r\n- Switch to research mode by saying *'Switch to Research Mode'*\r\n- :star: Initialize a new workspace like this: *'Initialize a new workspace about Carbon Fiber Applications in the Spacecraft industry'*. A workspace is a folder that collects and organize the results of the research. This protocol is subdivided into 3 sub-routines:\r\n   1. Core Paper identification: Use the **Semantic Scholar API** to identify some strongly relevant papers;\r\n   2. Core Expansion: for each paper, finds some suggestions, then keep only the suggestions that appear to be similar to at least 2 paper;\r\n   3. Refy Expansion: use the refy suggestion package to enlarge the results;\r\n- Find suggestions like: *'find suggestions that are sililar to the paper with title ...'*\r\n- Download: *'download the paper with title ...'*\r\n- :star: Query your database like: *'what is the author of the paper with title ...?'*  *'what are the experimental conditions set for the paper with title ...?'*\r\n\r\nPS: This mode is not super stable and needs to be worked on<br>\r\n\r\n*PPS: This project will be discontinued for some time since I'll be working on my thesis until 2024. However there are already so many things that can be improved so I'll be back!*\r\n\r\n---\r\n## APRIL 25th 2023 UPDATE: Jarvis can surf\r\nBy integrating LangChain into the project I am happy to bring some useful capabilities to Jarvis like accessing the web. Different LangChain Agents are now instructed to perform complex tasks like finding files, accessing the web, and extracting content from local resources...<br>\r\n - **Offline tasks**: the experience is now fully handled by AI so you don't need to guide jarvis in the tasks anymore. 
Earlier: *'Jarvis, find a file'* triggered the ```find_file``` function and then you had to provide keywords; now you can just say *'How many files are talking about X? Make a summary of one of them...'* and the system makes an action plan to satisfy your requests using different functions. The best part is that the agent in charge of this can tell whether a file is relevant to the question by opening it and questioning itself.<br>\r\n - **Online tasks**: a different agent is instructed to surf the web searching for answers. These agents will answer a question like *'What's the weather like?'* or *'What is the latest news about stocks?'*<br>\r\n\r\nSo to summarize: Jarvis is in charge of keeping the conversation going at a higher level and decides to delegate to the agents if needed; the offline agent puts its hands on files and PC stuff; the online agent surfs the web. I found this separation of tasks to work better, especially with the main objective being to chat. If you make a conversational agent with tools, the result is that it will mostly look for the answer online or among files, consuming credit and time. \r\n<br>\r\n\r\nWhile this solution brings some stability to the conversation, it's still not ideal for processing scientific papers and the system does not produce any material yet. I am working on a sort of *project mode* that is focused on actions rather than chatting. Ideally I'd like Jarvis to shift from *chat mode* to *project mode* when I ask *\"help me to understand topic X from the paper Y\"* and then other similar papers should be downloaded, summarized and compared.\r\n\r\n---\r\n\r\n---\r\n## APRIL 15th 2023 UPDATE: Vicuna Integration and offline modality\r\nWorked hard to bring a local, free alternative to OpenAI GPT models, since lately some open-source competition is arising. Vicuna is a free GPT-like model that is claimed to reach about 90% of ChatGPT's quality (with GPT-4 as the judge). 
My plan is to **integrate** this model with ChatGPT rather than replace ChatGPT outright. This is because the OpenAI API is more reliable, faster and seems less prone to hallucinations (i.e. when a conversational AI generates a response to a prompt that is either false or irrelevant to the original request). The installation of this model is one-click but, since the model is hardware-dependent, the response time will vary according to your hardware capabilities. I made a more [in-depth analysis](https://github.com/gia-guar/JARVIS-ChatGPT/tree/main/Vicuna/README.md) of whether you should install it or just stick to OpenAI.<br>\r\nThe gist comes down to how much vRAM and RAM you have. The oobabooga UI backend is pretty efficient in dynamically allocating the models.\r\n- ! Move the files from the ```whisper_edits``` folder to the ```.venv\\lib\\site-packages\\whisper``` ! <span style=\"color:grey\"> this is needed to better allocate the Whisper model between GPU vRAM and RAM;</span><br> \r\n- Added measures to improve GPU memory allocation; see ```get_answer(optimize_cuda=True)``` <span style=\"color:grey\"> this is beta, I still need to integrate it in a more built-in way with the rest of the scripts;</span>\r\n- Added OFFLINE toggle to run exclusively locally and avoid credit consumption <span style=\"color:grey\"> this is beta, I still need to integrate it in a more built-in way with the rest of the scripts;</span>\r\n---\r\n---\r\n## APRIL 11th 2023 UPDATE: Overall improvement to search engine, update README.md\r\nnew: ```pip install argostranslate pvporcupine python-dotenv```\r\n- Upgrading to Python 3.8 and CUDA 11.7 (!)\r\n- Lately, the ```translator``` package was taking too long to work (~20 seconds to get a translation), so I added another translator package that works instantly and offline;\r\n- The 'Jarvis' wake-up keyword was added from the ```picovoice``` package. 
It requires a free key you can get at https://picovoice.ai/platform/porcupine;\r\n- Fundamental improvements to the local search engine in terms of speed and credit consumption. With this update, accessing information from past conversations gets easier. When the search is completed the AI will summarize the text;\r\n- Using dotenv for easier authentication; \r\n<br>\r\n---\r\n---\r\n## APRIL 5th 2023 UPDATE: New Voice Models (F.R.I.D.A.Y), Expanding Local Search Engine and More\r\nI finally decided to upgrade the voice model from @CorentinJ to [@CoquiAI](https://github.com/coqui-ai/tts). This model works on the same principle (Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis) but is much faster, more versatile and offers more options to explore. Right now, I just want to push this version live; it works by default with one of the models offered by the TTS package. In the future, I'll explore the differences and strengths of all the other models (you can do it by changing the name of the model when the Voice is initialized inside ``Voice.py``, as shown in the ``tts_demo.py``). Moreover, this model is multilingual, so if you find any clean interviews of voice actors you can use them as models when the answer needs to be spoken in your (or any) language.\r\n<br>\r\nSecondly, I've made some improvements to the Local Search Engine. Now it can be accessed with voice. In particular:\r\n\r\n 1. Once you've made a prompt, ```LocalSearchEngine.analyze_prompt()``` will try to interpret the prompt and it will produce a flag, ranging from 1 to N (where N is the number of Actions the Assistant can make). The prompt analyzer makes use of a sort of *semantic if/else*. The idea is: *if* the **meaning** of the prompt is equal (has high cosine similarity) to the action, *then* return ```True``` *else* return ```False```; \r\n 2. 
If the flag corresponds to **\"1\"** the associated action will be **\"Look for a file\"** and that protocol will be triggered;\r\n 3. The system will first communicate its intentions and, if you confirm, the assistant will ask you to provide some search keywords;\r\n 4. The system will utilize a pandas DataFrame, where some topic tags are associated with the conversation, to detect relevant discussions;\r\n 5. Finally, the system will rank all the files from the most relevant to the least pertinent;\r\n 6. The natural following step would be to recover one of the files, but this is still a work in progress;\r\n<br>\r\n\r\nMinor updates:\r\n - Bug fixes;\r\n - Added ``langid``, ``TextBlob`` and ``translators`` to get faster translations and reduce GPT credit usage;\r\n - Improved Speech-to-text by reducing the possible languages to the ones specified in the Assistant model;\r\n<br>\r\n\r\n---\r\n---\r\n## April 1st 2023 UPDATE: Introducing the Local Search Engine, sounds and more\r\nI managed to build some tools that are capable of reading and abstracting information from textual files (.txt). These tools may prove valuable in the future, when voice commands that handle the Assistant's memory are introduced. The idea is to have specific commands like \"open the last conversation about topic X\" or \"I remember something you said about topic Y can you make a summary of that conversation?\". The LocalSearchEngine can find and sort the discussions by relevance (``cosine_similarity``) making use of embeddings: *an embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness. [OpenAI - what are embeddings](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings). You can find these inside ``Assistant\\tools.py``*\r\n\r\nThe LocalSearchEngine adopts a 4-step algorithm to compute the relevance between a key and a text:\r\n1. 
**Tag extraction**: ask ChatGPT to extract up to 10 topics (tags) that are discussed in the text. This will reduce the noise in the conversation, leading to more relevant, high-value content;\r\n2. **Translation**: OpenAI Embeddings work in any language but, to maximize the performance, the tags are translated into the same language as the key;\r\n3. **Embedding computation**: using ``text-embedding-ada-002`` to extract the Embeddings from both keys and translated tags;\r\n4. **Cosine Similarity Computation**: use OpenAI ``cosine_similarity()`` to compute the similarity index. Alternatively, you could also naively ask ChatGPT whether some text is relevant to a key or not, but the results weren't as good. Still, you could ask for a quick search ``quick_search()``;\r\n\r\n<p align=\"center\">\r\n  <img src=\"https://user-images.githubusercontent.com/49094051/229243205-337b7bfa-2e7b-43b1-a770-62b524367dc6.PNG\" /><br>\r\n  <i><span style=\"color:grey\">Custom DataFrame showing the search results with key \"physics and space exploration\" among some conversations I had.<br> Here you can also see the tags that were extracted. Notice that some of them are in Italian, since the conversation was held in my native language </span></i> \r\n </p>\r\n\r\n\r\nFurthermore, a ``Translator`` class was implemented as an additional tool. \r\n\r\nOther important updates:\r\n- introduced a ``VirtualAssistant`` object to allow a more intuitive ``main`` flow and handle most of the requests ``Assistant\\VirtualAssistant.py``;\r\n- rearranged the directory structure to 'hide' some backend code;\r\n\r\nOther minor updates:\r\n- introduced sounds! Now you can have sound feedback of what is happening with the Assistant. Seeing is believing;\r\n- Made overall code slightly more efficient; \r\n<br>\r\n<br>\r\n---\r\n---\r\n## March 26 2023 UPDATE: Background execution and Hands Free control\r\nThis update is focused on making the assistant more viable in everyday life. 
\r\n - You can now follow the instructions in <span style=\"color:green\"> BootJARVIS.pdf </span> to run the main script automatically when the system boots;\r\n - There is no more need to press Ctrl+C to deliver mic recordings to Whisper. Here is how *Hands Free* and *Summoning* work:\r\n    1. By default the assistant is asleep. A Python script will however run passively to listen for a special keyword to summon JARVIS. The function is hence called ```PassiveListen```; you'll find it inside get_audio.py. It leverages the SpeechRecognition Python package, which is faster and more versatile than Whisper but less accurate. \r\n    2. When the keyword (by default ```'elephant'```* ) is spoken, the assistant wakes up and the following audio is fed to Whisper. You'll need to summon the assistant only once, every time you begin a conversation.\r\n    3. When a conversation begins, the assistant will listen. After you deliver a prompt, **3 seconds** will pass (```RESPONSE_TIME``` inside the ```record()``` function in get_audio.py ) before the audio is processed and transcribed by Whisper. A conversation is closed when the user says a Closing Keyword (for now 'thanks'). This will trigger the chat saving and will put the assistant back to sleep, waiting for an awakening keyword. \r\n    4. Alternatively, the system will go back to sleep automatically if **30 seconds** (variable ```SLEEP_DELAY``` inside the ```record()``` function in get_audio.py ) pass after the response is reproduced.\r\n - minor improvements:\r\n    1. Added the package pyttsx3 for text to speech when IBM is unavailable (monthly usage expired);\r\n    2. improved CLI outputs;\r\n    3. improved descriptions;\r\n\r\n(*) ```'elephant'``` was chosen as the default keyword because it's uncommon and understandable. 
You can change it to 'Hey Jarvis' or 'Hey Siri' but you need to be sure the system catches what you are saying (SpeechRecognition is not that good with fancy names). Maybe better ways to summon the assistant will be devised in the future.\r\n<br>\r\n<br>\r\n\r\n---\r\n---\r\n## March 13 2023 UPDATE: JARVIS VOICE IS HERE!\r\n**How I did it**: I spent a huge amount of time on @CorentinJ's GitHub repository https://github.com/CorentinJ/Real-Time-Voice-Cloning which provides an interface to generate audio from text using a pretrained text-to-speech model. The GUI is pretty clever and I admire his work; however, using the model in a Python script is not straightforward! I first edited the toolbox to save **embeddings**, which are the beginning of the generation process. They are the \"voice ID\" of the targeted people, expressed as a matrix. With this edit, I used the toolbox to generate Paul Bettany's voice embedding. <br>\r\nThen, I wrote a trimmed version of CorentinJ's toolbox, `JARVIS.py`. This version can load the embedding learned from Jarvis' voice and do basic operations like synthesize and vocode upon request from any script. \r\n\r\n![toolbox](https://user-images.githubusercontent.com/49094051/224836993-ee7b4964-e518-46f4-85b1-b25f48f1a78c.PNG)\r\n<p align=\"center\"> Original Toolbox interface: you can see the embedding </p>\r\n"
  },
  {
    "path": "Vicuna/README.md",
    "content": "# OOBABOOGA-UI & VICUNA INSTALLATION\nGuide on how to run a ChatGPT alternative on your PC.\n\n## 1. What are the OOBABOOGA-UI and Vicuna?\nThe [oobabooga text generation web ui](https://github.com/oobabooga/text-generation-webui) is a User Interface that can utilize many open-source Large Language Models like Alpaca, Llama and many others. **It's 100% free and runs on your pc so no connection is required**. The powerful aspect of the Oobabooga interface is that it provides a way to interface with the model through an Application Program Interface, which is handy for this project. <br>\n<p>\n    <img width=\"2348\" alt=\"Vicuna\" src=\"https://raw.githubusercontent.com/oobabooga/screenshots/main/cai3.png\" width=\"10\">\n</p>\n<p align='center'>\n    <span style=\"color:grey\"> Example of conversation in Oobabooga UI </span>\n</p>\n\n[Vicuna 13-b](https://vicuna.lmsys.org/) is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna achieves more than 90% quality of OpenAI ChatGPT. \n\n## 2. Should I install this? (*)\nProcessing large amounts of information can be quite expensive if you are running on a pay-as-you-go account. Secondly, having a backup solution for when our connection isn't reliable or just when there is too much traffic on OpenAI servers has its pros and cons. For this project, I am planning on using offline, free text processing for PDFs and long website analysis. In this way It'll be possible to let the computer elaborate huge amounts of material in the background while I'll be doing other stuff.\n\n## 3. (*) Pros and cons\nPros:\n 1. **local**, no internet needed;\n 2. **free**;\n 3. reasonably good *(see cons: hallucinations);\n 4. runs on CPU and/or GPU;\n 5. You have some freedom in choosing what opensource model works best for you;\n\nCons:\n 1. **Size**: these are **LARGE** language models. 
They are designed to use as much RAM/vRAM as is available on your device. This means that, depending on the performance of your hardware, you might face Out Of Memory errors.\n <p align=\"center\">\n    <img width=\"2348\" alt=\"Vicuna\" src=\"https://user-images.githubusercontent.com/49094051/232077149-882178eb-c73e-4834-b82e-44cafa941666.PNG\">\n</p>\n\n> This screenshot was taken running Whisper's \"large\" model (10GB) and the GPU Vicuna model. In particular, you can see the RAM getting filled when the model begins to process.\n\n 2. **Speed**: OpenAI services are generally faster. The speed of the answer will depend on your hardware;\n 3. **Multitasking**: due to their size and RAM footprint, multitasking with this software running might be difficult;\n 4. **Hallucinations**: these models are quite raw and they tend to have long discussions... with themselves! Coding can put a patch on the problem but right now it's still an issue and, sometimes, answers might be *weird*. <span style=\"color:grey\"> For example, Vicuna might ask itself a question after it gave an answer and go on like that until it consumes all tokens!</span>\n\n# INSTALLATION\n## 1. Make a folder anywhere on your computer and click on the folder path:\n <p align=\"center\">\n    <img width=\"2348\" alt=\"Vicuna\" src=\"https://user-images.githubusercontent.com/49094051/232081647-5c5ccc3e-1fc0-45d8-905b-4c91ac67e77f.png\">\n    <span style='color:grey'> Here <i>ChatGPT</i> is where I am keeping the project, while <i>vicuna</i> is the folder I chose to install Vicuna in </span>\n</p>\n\n\n## 3. Type ```powershell``` and hit enter\n\n## 4. Run ```iex (irm vicuna.tc.ht)``` from powershell\nThis command will run a One-Click-Installation provided by [TroubleChute](https://hub.tcno.co/ai/text-ai/vicuna/). You can follow the instructions you'll see on screen or check his [step-by-step tutorial](https://youtu.be/d4dk_7FptXk). \n\n## 5. 
Select CPU or GPU (or both)\nThe Vicuna CPU model ([eachadea/ggml-vicuna-13b-4bit](https://huggingface.co/eachadea/legacy-ggml-vicuna-13b-4bit)) weighs ~15 GB. It can run only on CPU (RAM);<br>\nThe Vicuna GPU model ([anon8231489123/vicuna-13b-GPTQ-4bit-128g](https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g)) weighs ~7.5 GB. It can dynamically allocate itself between RAM and vRAM. <br>\n\nYou can also download manually using the following syntax on CMD: <br>\n```cd oobabooga-windows\\text-generation-webui```<br>\n```python download-model.py facebook/opt-1.3b```<br>\n[Hugging Face](https://huggingface.co/models?pipeline_tag=text-generation&sort=downloads) is the main place to download models. These are some examples:\n* [Pythia](https://huggingface.co/models?sort=downloads&search=eleutherai%2Fpythia+deduped)\n* [OPT](https://huggingface.co/models?search=facebook/opt)\n* [GALACTICA](https://huggingface.co/models?search=facebook/galactica)\n* [GPT-J 6B](https://huggingface.co/EleutherAI/gpt-j-6B/tree/main)\n\n\n## 6. Edit the ```start-webui-vicuna.bat``` or run ```start-webui-vicuna-gpu.bat```\nReplace the last line with:<br>\n - If you are using GPU:\n``` call python server.py --wbits 4 --groupsize 128 --listen --no-stream --model anon8231489123_vicuna-13b-GPTQ-4bit-128g --notebook --extensions api```\n - If you are using CPU: ```call python server.py --model eachadea_ggml-vicuna-13b-4bit --listen --no-stream --notebook --extensions api```\n<br>\nThis edit is already applied in the file ```start-webui-vicuna-gpu.bat``` you'll find in this folder.\nYou should be able to run the ```.bat``` file with no errors and the following message should appear: <br>Running on local URL:  http://0.0.0.0:786<br>\nTo create a public link, set `share=True` in `launch()`.\n"
  },
  {
    "path": "Vicuna/start-webui-vicuna-gpu.bat",
    "content": "@echo off\r\n\r\n@echo Starting the web UI...\r\n\r\ncd /D \"%~dp0\"\r\n\r\nset MAMBA_ROOT_PREFIX=%cd%\\installer_files\\mamba\r\nset INSTALL_ENV_DIR=%cd%\\installer_files\\env\r\n\r\nif not exist \"%MAMBA_ROOT_PREFIX%\\condabin\\micromamba.bat\" (\r\n  call \"%MAMBA_ROOT_PREFIX%\\micromamba.exe\" shell hook >nul 2>&1\r\n)\r\ncall \"%MAMBA_ROOT_PREFIX%\\condabin\\micromamba.bat\" activate \"%INSTALL_ENV_DIR%\" || ( echo MicroMamba hook not found. && goto end )\r\ncd text-generation-webui\r\n\r\ncall python server.py --wbits 4 --groupsize 128 --listen --no-stream --model anon8231489123_vicuna-13b-GPTQ-4bit-128g --extensions api\r\n\r\n:end\r\npause\r\n"
  },
  {
    "path": "Vicuna/vicuna.ps1",
    "content": "# Copyright (C) 2023 TroubleChute (Wesley Pyburn)\n# Licensed under the GNU General Public License v3.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     https://www.gnu.org/licenses/gpl-3.0.en.html\n#\n#    This program is free software: you can redistribute it and/or modify\n#    it under the terms of the GNU General Public License as published by\n#    the Free Software Foundation, either version 3 of the License, or\n#    (at your option) any later version.\n#    \n#    This program is distributed in the hope that it will be useful,\n#    but WITHOUT ANY WARRANTY; without even the implied warranty of\n#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\n#    GNU General Public License for more details.\n#    \n#    You should have received a copy of the GNU General Public License\n#    along with this program.  If not, see <https://www.gnu.org/licenses/>.\n#\n# ----------------------------------------\n# This script:\n# 1. Check if current directory is oobabooga-windows, or oobabooga-windows is in directory\n# 2. Run my install script for obabooga/text-generation-webui\n# 3. Tells you how to download the vicuna model, and opens the model downloader.\n# 4. Run the model downloader (Unless CPU-Only mode was selected in install, in which case the CPU model is downloaded)\n# 5. Replace commands in the start-webui.bat file\n# 6. Create desktop shortcuts\n# 7. Run the webui\n# ----------------------------------------\n\nWrite-Host \"Welcome to TroubleChute's Vicuna installer!\" -ForegroundColor Cyan\nWrite-Host \"Vicuna as well as all of its other dependencies and a model should now be installed...\" -ForegroundColor Cyan\nWrite-Host \"[Version 2023-04-11]`n`n\" -ForegroundColor Cyan\n\n# 1. 
Check if current directory is oobabooga-windows, or oobabooga-windows is in directory\n# If it is, CD back a folder.\n$currentDir = (Get-Item -Path \".\\\" -Verbose).FullName\nif ($currentDir -like \"*\\oobabooga-windows\") {\n    Set-Location ../\n}\n\n$containsFolder = Get-ChildItem -Path \".\\\" -Directory -Name | Select-String -Pattern \"oobabooga-windows\"\nif ($containsFolder) {\n    Write-Host \"The 'oobabooga-windows' folder already exists.\" -ForegroundColor Cyan\n    $downloadAgain = Read-Host \"Do you want to download it again? (Y/N)\"\n\n    if ($downloadAgain -eq \"Y\" -or $downloadAgain -eq \"y\") {\n        # Perform the download again\n        $containsFolder = $False\n    }\n}\n\nif (-not $containsFolder) {\n    Write-Host \"I'll start by installing Oobabooga first, then we'll get to the model...`n`n\"\n    \n    #2. Choose CPU or GPU installation\n    Write-Host \"Do you have an NVIDIA GPU?\" -ForegroundColor Cyan\n    Write-Host \"Enter anything but y or n to skip.\" -ForegroundColor Yellow\n\n    $choice = Read-Host \"Answer (y/n)\"\n\n    $skip_model = 1\n    $skip_start = 1\n\n    if ($choice -eq \"Y\" -or $choice -eq \"y\") {\n        Write-Host \"Installing GPU & CPU compatible version\" -ForegroundColor Cyan\n        Write-Host \"If this fails, please delete the folder and choose 'N'\" -ForegroundColor Cyan\n        $gpu = \"Yes\"\n        # 2. Run my install script for obabooga/text-generation-webui\n        iex (irm ooba.tc.ht)\n    }\n    elseif ($choice -eq \"N\" -or $choice -eq \"n\") {\n        Write-Host \"Installing CPU-Only version\" -ForegroundColor Cyan\n        $gpu = \"No\"\n        # 2. 
Run my install script for obabooga/text-generation-webui\n        iex (irm ooba.tc.ht)\n    }\n\n} else {\n    # CD into folder anyway\n    Set-Location \"./oobabooga-windows\"\n}\n\nfunction Get-VicunaCPU() {\n    # Download CPU model (only the updated one)\n    # If downloaded using model downloader, another 8.14 GB download will be run...\n    $url = \"https://huggingface.co/eachadea/ggml-vicuna-13b-4bit/resolve/main/ggml-vicuna-13b-4bit-rev1.bin\"\n    $outputPath = \"text-generation-webui\\models\\eachadea_ggml-vicuna-13b-4bit\\ggml-vicuna-13b-4bit-rev1.bin\"\n\n    # Download the file from the URL\n    Write-Host \"Downloading: eachadea/ggml-vicuna-13b-4bit (CPU model)\" -ForegroundColor Cyan\n    Get-Aria2File -Url $url -OutputPath $outputPath\n    Write-Host \"`nDone!`n\"\n}\nfunction Get-VicunaGPU() {\n    # Download GPU/CUDA model\n    $blob = \"https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g/resolve/main\"\n    $outputPath = \"text-generation-webui\\models\\anon8231489123_vicuna-13b-GPTQ-4bit-128g\"\n\n    # Download the file from the URL\n    Write-Host \"Downloading: anon8231489123/vicuna-13b-GPTQ-4bit-128g (GPU/CUDA model)\" -ForegroundColor Cyan\n    $files = @(\n        \"vicuna-13b-4bit-128g.safetensors\",\n        \"tokenizer_config.json\",\n        \"tokenizer.model\",\n        \"special_tokens_map.json\",\n        \"pytorch_model.bin.index.json\",\n        \"generation_config.json\",\n        \"config.json\"\n    )\n\n    Get-Aria2Files -Url $blob -OutputPath $outputPath -Files $files\n    Write-Host \"`nDone!`n\"\n}\n\n# Allow importing remote functions\niex (irm Import-RemoteFunction.tc.ht)\nImport-FunctionIfNotExists -Command Get-Aria2File -ScriptUri \"File-DownloadMethods.tc.ht\"\n\n# Create the output folder if it does not exist\nNew-Item -ItemType Directory -Force -Path (Split-Path -Parent \"text-generation-webui\\models\\eachadea_ggml-vicuna-13b-4bit\") | Out-Null\nif (-not $?) 
{\n    Write-Error \"Failed to create directory.\"\n}\n\nif ($gpu -eq \"No\") {\n    Get-VicunaCPU\n} else {\n    Write-Host \"`n`nPick which models to download:\" -ForegroundColor Cyan\n    Write-Host -NoNewline \"CPU (7.5GB): \" -ForegroundColor Red\n    Write-Host \"1\" -ForegroundColor Green\n    Write-Host -NoNewline \"GPU [Nvidia] (6.9GB): \" -ForegroundColor Red\n    Write-Host \"2\" -ForegroundColor Green\n    Write-Host -NoNewline \"CPU + GPU [Nvidia] (14.4GB): \" -ForegroundColor Red\n    Write-Host \"3\" -ForegroundColor Green\n    \n    $num = Read-Host \"Enter a number\"\n    if ($num -eq \"1\") {\n        Get-VicunaCPU\n    } elseif ($num -eq \"2\") {\n        Get-VicunaGPU\n    } elseif ($num -eq \"3\") {\n        Get-VicunaCPU\n        Get-VicunaGPU\n    }\n}\n\n# 5. Replace commands in the start-webui.bat file\n# Create CPU and GPU versions\nCopy-Item \"start-webui.bat\" \"start-webui-vicuna.bat\"\n\nif (-not ($gpu -eq \"No\")) {\n    (Get-Content -Path \"start-webui-vicuna.bat\") | ForEach-Object {\n        $_ -replace\n            'call python server\\.py --auto-devices --cai-chat',\n            'call python server.py --auto-devices --chat --model anon8231489123_vicuna-13b-GPTQ-4bit-128g --wbits 4 --groupsize 128'\n    } | Set-Content -Path \"start-webui-vicuna-gpu.bat\"\n}\n\n(Get-Content -Path \"start-webui-vicuna.bat\") | ForEach-Object {\n    $_ -replace\n        'call python server\\.py --auto-devices --cai-chat',\n        'call python server.py --auto-devices --chat --model eachadea_ggml-vicuna-13b-4bit'\n} | Set-Content -Path \"start-webui-vicuna.bat\"\n\n# 6. Create desktop shortcuts\nif ($gpu -eq \"No\") {\n    Write-Host \"`n`nCreate desktop shortcuts for 'Vicuna (CPU)'\" -ForegroundColor Cyan\n} else {\n    Write-Host \"`n`nCreate desktop shortcuts for 'Vicuna' and 'Vicuna (CPU)'\" -ForegroundColor Cyan\n}\n$shortcuts = Read-Host \"Do you want desktop shortcuts? 
(Y/N)\"\n\nif ($shortcuts -eq \"Y\" -or $shortcuts -eq \"y\") {\n    iex (irm Import-RemoteFunction.tc.ht) # Get RemoteFunction importer\n    Import-RemoteFunction -ScriptUri \"https://New-Shortcut.tc.ht\" # Import function to create a shortcut\n    \n    Write-Host \"Downloading Vicuna icon...\"\n    Invoke-WebRequest -Uri 'https://tc.ht/PowerShell/AI/vicuna.ico' -OutFile 'vicuna.ico'\n    if (-not ($gpu -eq \"No\")) {\n        Write-Host \"`nCreating shortcuts on desktop...\" -ForegroundColor Cyan\n        $shortcutName = \"Vicuna oobabooga\"\n        $targetPath = \"start-webui-vicuna-gpu.bat\"\n        $IconLocation = 'vicuna.ico'\n        New-Shortcut -ShortcutName $shortcutName -TargetPath $targetPath -IconLocation $IconLocation\n    }\n\n    $shortcutName = \"Vicuna (CPU) oobabooga\"\n    $targetPath = \"start-webui-vicuna.bat\"\n    $IconLocation = 'vicuna.ico'\n    New-Shortcut -ShortcutName $shortcutName -TargetPath $targetPath -IconLocation $IconLocation\n    \n}\n\n# 7. Run the webui\nif ($gpu -eq \"No\") {\n    Start-Process \".\\start-webui-vicuna.bat\"\n} else {\n    # Ask user if they want to launch the CPU or GPU version\n    Write-Host \"`n`nEnter 1 to launch CPU version, or 2 to launch GPU version\" -ForegroundColor Cyan\n    \n    $choice = Read-Host \"1 (CPU) or 2 (GPU)\"\n    \n    if ($choice -eq \"1\") {\n        Start-Process \".\\start-webui-vicuna.bat\"\n    }\n    elseif ($choice -eq \"2\") {\n        Start-Process \".\\start-webui-vicuna-gpu.bat\"\n    }\n    else {\n        Write-Host \"Invalid choice. Please enter 1 or 2.\"\n    }\n}\n"
  },
  {
    "path": "demos/chat_with_keyboard.py",
    "content": "import openai\nimport os\nfrom dotenv import load_dotenv\nload_dotenv()\n\"\"\"\nChat with ChatGPT straight from your command line (useful for code)\n\"\"\"\n\n\nCHAT = [{\"role\": \"system\", \"content\": \"You are a helpful assistant. You can ask questions to keep the conversation entertaining.\"}]\n\n# Set your openai api key\nopenai.api_key = os.getenv('OPENAI_API_KEY')\n\nwhile True:\n    try:\n        question = input('[user]: ')\n        CHAT.append({\"role\":\"user\",\"content\":question})\n        API_response = openai.ChatCompletion.create(\n                    model='gpt-3.5-turbo',\n                    messages=CHAT)\n                \n        answer = API_response['choices'][0]['message']['content']\n        print(f\"[assistant]: {answer}\")\n\n        CHAT.append({\"role\":\"assistant\",\"content\":answer})\n    except Exception as e:\n        print(e)\n        quit()"
  },
  {
    "path": "demos/demo_da_vinci.py",
    "content": "import openai\n\n## SET YOUR API KEY:\nopenai.api_key = ''\n\ndef generate_single_response(prompt, model_engine=\"text-davinci-003\", temp=0.5):\n    openai.api_key = 'your openai api key'  \n    prompt = (f\"{prompt}\")\n    completions = openai.Completion.create(\n        engine=model_engine,\n        prompt=prompt,\n        max_tokens=1024,\n        temperature=temp\n    )\n    return completions.choices[0].text\n\ndef get_answer(message):\n    CHAT = [{\"role\": \"system\", \"content\": \"\"\"You are a prompt manager. \n    You have access to a suite of actions and decide if a prompt requires one of them to be executed. Your actions:\n    1 - find a file: file can store past conversations, textual information;\n    2 - respond: the user wants to have a general answer to a topic;\n    3 - adjust settings;\n    4 - find a directory inside the Computer;\n    5 - save this conversation for the future\n    6 - check on internet\n    7 - None of the above\n    \n    You can answer only with numbers. A number must always be present in your answer\"\"\"},{\"role\":\"user\", \"content\":\"PROMPT: \"+message}]\n\n    response = openai.ChatCompletion.create(\n                model=\"gpt-3.5-turbo\",\n                temperature=0,\n                max_tokens=1,\n                messages=CHAT)\n    \n    return response\n\n\n\nif __name__ == \"__main__\":\n    \n    while True:\n        message = input(\"[user]: \")\n        answer = get_answer(message=message)\n        print(answer['choices'][0]['message']['content'])"
  },
  {
    "path": "demos/demo_elevenlabs.py",
    "content": "import elevenlabslib\nfrom dotenv import load_dotenv\nimport os\nfrom pydub import AudioSegment\nimport io\nimport pygame\n\nload_dotenv()\n\napi_key = os.getenv('ELEVENLABS_API_KEY')\n\nuser = elevenlabslib.ElevenLabsUser(api_key)\n\ntext = \"This is the documentation for the ElevenLabs API. You can use this API to use our service programmatically, this is done by using your xi-api-key\"\nwrite_dir = os.path.join('Assistant', 'answers')\n\nelevenlabs_voice = user.get_voices_by_name('Antoni')[0]\naudio = elevenlabs_voice.generate_audio_bytes(text)\n\naudio = AudioSegment.from_file(io.BytesIO(audio), format=\"mp3\")\naudio.export(os.path.join(write_dir, \"speech.wav\"), format=\"wav\")\nif pygame.mixer.get_init() is None: pygame.mixer.init()\ntry:\n    pygame.mixer.music.load(os.path.join(write_dir, \"speech.wav\"))\nexcept Exception as e:\n    print(e)\n\npygame.mixer.music.set_volume(0.5)\npygame.mixer.music.play()\nwhile(pygame.mixer.music.get_busy()):pass"
  },
  {
    "path": "demos/demo_google_search.py",
    "content": "from langchain.utilities import GoogleSerperAPIWrapper\nfrom dotenv import load_dotenv\nload_dotenv()\nfrom langchain.memory import ConversationBufferMemory\nfrom langchain.llms.openai import OpenAI\nfrom langchain import OpenAI, LLMChain \nfrom langchain.agents import initialize_agent, Tool, load_tools\nfrom langchain.agents import AgentType, ZeroShotAgent, AgentExecutor\n\nimport geocoder\nfrom newsapi import NewsApiClient\nimport os\n\n\n\ndef generateGoogleAgent():\n    names = ['wikipedia', 'requests', 'open-meteo-api']\n    \n    # default to '' so a missing key does not raise TypeError on len(None)\n    if len(os.getenv('SERPER_API_KEY', ''))>1:\n        print('(using google-serper)')\n        names.append('google-serper')\n    elif len(os.getenv('GOOGLE_API_KEY', ''))>1:\n        print('(using google-search)')\n        names.append('google-search')\n    \n    tools = load_tools(names)\n    custom_tools = [\n        Tool(\n            name ='Locate me',\n            func = locate_me, \n            description='useful to know the current geographical location'),\n        Tool(\n            name='News',\n            func=news,\n            description='Use this when you want to get information about the top headlines of current news stories. The input should be a keyword describing the topic')\n        ]\n    \n    for item in custom_tools: tools.append(item)\n        \n    prefix = \"\"\"Answer the question. 
You also have access to the following tools:\"\"\"\n    suffix = \"\"\"Begin!\n\n    {chat_history}\n    Question: {input}\n    {agent_scratchpad}\"\"\"\n\n    prompt = ZeroShotAgent.create_prompt(\n        tools, \n        prefix=prefix, \n        suffix=suffix, \n        input_variables=[\"input\", \"chat_history\", \"agent_scratchpad\"]\n    )\n    llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)\n    agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)\n\n    memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, human_prefix='user', ai_prefix='assistant')\n    \n    return AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True, memory=memory, early_stopping_method = 'generate', max_iterations=2)\n\n\ndef news(keyword):\n    newsapi = NewsApiClient(api_key=os.getenv('NEWS_API_KEY'))\n    top_headlines = newsapi.get_top_headlines(q=keyword)\n    if len(top_headlines['articles'])==0:\n        top_headlines = newsapi.get_everything(q=keyword)\n        top_headlines['articles'] = top_headlines['articles'][0:min(len(top_headlines['articles']),10)]\n\n    res = ''\n\n    for article in top_headlines['articles']:\n        res += '\\n'+article['title']+'\\nurl: '+article['url']+'\\n'\n\n    print(res)\n    return res\n\ndef locate_me(p):\n    g = geocoder.ip('me')\n    loc = str(g.city)+', '+ str(g.state)+', '+str(g.country) \n    print(loc)\n    return loc\n\nllm = OpenAI(temperature=0)\n\nnames = [\"requests\"]\n\n\ntools = load_tools(names)\ntools.append(Tool(\n        name='Locate Me',\n        func=locate_me,\n        description='Helpful when you need to access the current geographical position',\n        ))\n\ntools.append(Tool(\n    name='News',\n    func=news,\n    description='Use this when you want to get information about the top headlines of current news stories. 
The input should be a keyword describing the topic'\n))\n\n\nself_ask_with_search = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)\nself_ask_with_search = generateGoogleAgent()\nprint(self_ask_with_search.run(\"What are the latest news about AI?\"))"
  },
  {
    "path": "demos/demo_local_search_engine.py",
    "content": "import openai\nimport os\nimport whisper\nimport pyaudio\nimport sys\nimport pygame\nfrom datetime import datetime\nimport copy\nimport time\nimport pyttsx3\nimport pandas as pd\nimport ast\n\nfrom ibm_watson import TextToSpeechV1\nfrom ibm_cloud_sdk_core.authenticators import IAMAuthenticator\n\nimport backup.get_audio as myaudio\nfrom Assistant.VirtualAssistant import VirtualAssistant\n\nopenai.api_key = 'your-openai-api-key'\n\ntext2 =\"\"\"user: What are transformers architecture and how can they be used in segmentation?\nassistant: The transformer architecture is a deep learning model that has been widely used in natural language processing applications due to its impressive performance in language translation and text generation tasks. The model is based on an attention mechanism that allows it to selectively focus on different parts of the input sequence, enabling it to capture long-range dependencies between words.\nIn recent years, researchers have started exploring the applicability of transformer models in computer vision tasks such as image segmentation. One of the advantages of using transformers for segmentation is their ability to process images as sequences of tokens. This means that the model can take into account the spatial relationship between pixels in an image, allowing it to produce more contextually aware segmentations.\nThere are different ways to incorporate transformers into segmentation models. One approach is to use a pre-trained transformer model, such as the Vision Transformer (ViT), to encode the input image into a sequence of feature vectors. These feature vectors are then fed into a segmentation head that produces pixel-wise predictions for each class.\nAnother approach is to use transformers in a fully self-supervised setting, where the model is trained to predict the color or texture of a patch given its surrounding context. 
This pre-training step allows the model to learn powerful image representations that can be used for downstream tasks like segmentation.\nIn summary, transformers have shown great potential in segmentation tasks by allowing models to capture complex spatial relationships in image data. However, like any deep learning model, they require significant computational resources and training data to achieve state-of-the-art performance.\n\"\"\"\n\ntext3 = \"\"\"system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words. user: What can you tell me about electric propulsion in rockets?\n        assistant: Electric propulsion is a type of rocket propulsion that uses electrical energy to accelerate propellant and generate thrust. It is often used in spacecraft propulsion systems for missions that require precise control and long-duration high-speed travel. \nThere are different types of electric propulsion systems, but they all work on the same principle: accelerating ions or electrons using an electric field. This is in contrast to traditional chemical rockets, which generate thrust by combusting fuel.\nOne type of electric propulsion is called electrically-powered ion propulsion, which uses ions that are generated by an electrically charged grid to propel the spacecraft. This system uses a low amount of fuel but produces a weak thrust, making it ideal for long-duration missions where high velocities are required.\n\"\"\"\n\n\nfrom Assistant.tools import LocalSearchEngine\nfrom Assistant.tools import Translator\n\nsearch = LocalSearchEngine()\ntr = Translator()\n\nfrom tests import load_keys\nload_keys()\n\nsearch.produce_folder_tags()\n"
  },
  {
    "path": "demos/demo_pyaudio.py",
    "content": "import Assistant.get_audio as myaudio\n\nprint('get_devices() print all devices:')\nprint(myaudio.get_devices())\n\nprint('\\n\\n')\n\nprint('detect_microphones() print mics only:')\nprint(myaudio.detect_microphones())\n\nprint('\\n')\nprint('get_device_channels() return channel count for each device\\n', myaudio.get_device_channels())\n\n# select the first available microphone\nM = myaudio.get_devices()[0]['name']\nprint(M)"
  },
  {
    "path": "demos/demo_research_mode.py",
    "content": "print('### LOADING CREDENTIALS ###')\nfrom dotenv import load_dotenv\nimport os\n\nfrom Assistant.research_mode import ResearchAssistant\nfrom Assistant.semantic_scholar.agent_tools import readPDF, update_workspace_dataframe\n\nload_dotenv()\n\n# os.getenv(..., '') avoids a KeyError when a variable is missing from .env entirely\nif len(os.getenv('OPENAI_API_KEY', ''))==0: \n    print('openai API key not detected in .env')\n    raise Exception(\"[$] openai API key is required. Learn more at https://platform.openai.com/account/api-keys\")\n\nif len(os.getenv('IBM_API_KEY', ''))==0: print('[free] IBM cloud API Key not detected in .env\\nLearn more at: https://cloud.ibm.com/catalog/services/text-to-speech')\n\nif len(os.getenv('IBM_TTS_SERVICE', ''))==0: print('[free] IBM cloud TTS service not detected in .env\\nLearn more at: https://cloud.ibm.com/catalog/services/text-to-speech')\n\nuse_porcupine = True\nif len(os.getenv('PORCUPINE_KEY', '')) == 0: \n    print('[free] PicoVoice not detected in .env\\nLearn more at: https://picovoice.ai/platform/porcupine/')\n    use_porcupine = False\n\n\nprint('DONE\\n')\n\nprint('### IMPORTING DEPENDENCIES ###')\nimport pygame\n\nfrom Assistant import get_audio as myaudio\nfrom Assistant.VirtualAssistant import VirtualAssistant\nfrom Assistant.tools import count_tokens\nimport pinecone\nprint('DONE\\n')\n\n### MAIN\nif __name__==\"__main__\":\n    print(\"### SETTING UP ENVIRONMENT ###\")\n    OFFLINE = False\n    pygame.mixer.init()\n    # INITIATE JARVIS\n    print('initiating JARVIS voice...')\n    jarvis = VirtualAssistant(\n        openai_api   = os.getenv('OPENAI_API_KEY'),\n        ibm_api      = os.getenv('IBM_API_KEY'),\n        ibm_url      = os.getenv('IBM_TTS_SERVICE'),\n        elevenlabs_api = os.getenv('ELEVENLABS_API_KEY'),\n        elevenlabs_voice = 'Antoni',\n        voice_id     = {'en':'jarvis_en'},\n        whisper_size = 'medium',\n        awake_with_keywords=[\"jarvis\"],\n        model= \"gpt-3.5-turbo\",\n        embed_model= \"text-embedding-ada-002\",\n        RESPONSE_TIME = 3,\n      
  SLEEP_DELAY = 30,\n        mode = 'RESEARCH',\n        )\n\n    jarvis.init_research_mode()\n    i = 0\n    while True:  \n        prompt = input(\"user: \")\n        # check exit command\n        if \"THANKS\" in prompt.upper() or len(prompt.split())<=1:\n            jarvis.go_to_sleep()\n            continue\n        \n        jarvis.expand_conversation(role=\"user\", content=prompt)\n\n        # PROMPT MANAGING [BETA]\n        flag = jarvis.analyze_prompt(prompt)\n\n        print(flag)\n        # redirect the conversation to an action manager or to the LLM\n        if \"1\" in flag or \"tool\" in flag:\n            print('(thought): action')\n            response = jarvis.use_tools(prompt)\n            response = response\n        \n        elif \"2\" in flag or \"respond\" in flag:\n            print('(thought): response')\n            response = jarvis.get_answer(prompt)\n        else:\n            print(f'(thought): {flag}: workspace')\n            input('> continue?')\n            response = jarvis.secondary_agent(prompt)\n            \n\n        jarvis.expand_conversation(role='assistant', content=response)\n        pygame.mixer.stop()\n        jarvis.say(response, VoiceIdx='en', elevenlabs=True)\n\n        i+=1"
  },
  {
    "path": "demos/demo_tts.py",
    "content": "from TTS.api import TTS\nimport os\n\n\n\"\"\"\nDESCRIPTION: Clone some voices in Real Time with TTS! run the scripts and listen to output.wav\n\"\"\"\n\nTEXT = \"\"\" OpenAI provides a toolkit called \"OpenAI API\" that allows you to compute multi-head self-attention for a given text. The OpenAy I A Pi I provides access to a range of pre-trained language models, such as Gee Pee Tee 2 and Gee Pee Tee 3, which are capable of performing multi-head self-attention on text data.\nTo use the OpenAI API, you first need to sign up for an API key and then import the A P I client into your Python environment. Once you have set up the A P I client, you can use it to send requests to the OpenAI API server, which will process your text data and return a response containing the multi-head self-attention weights.\nThe OpenAI API offers a simple and straightforward way to access powerful language models and compute complex natural language processing tasks, such as multi-head self-attention, without the need for extensive computational resources or expertise in machine learning.\n\"Exploring Multi-Head Self-Attention for Keyword Identification in Natural Language Processing\"\n\n\"\"\"\n\n#   Running a multi-speaker and multi-lingual model\n\n#   List available 🐸TTS models and choose the first one\nmodel_name = TTS.list_models()[0]\n#   Init TTS\ntts = TTS(model_name)\n#   Run TTS\n#    ❗ Since this model is multi-speaker and multi-lingual, we must set the target speaker and the language\n#   Text to speech with a numpy output\n\nwav = tts.tts(\"This is a test! 
This is also a test!!\", speaker=tts.speakers[0], language=tts.languages[0])\n#    Text to speech to a file\ntts.tts_to_file(text=\"Hello world!\", speaker=tts.speakers[0], language=tts.languages[0], file_path=\"output.wav\")\n\n#   Running a single speaker model\n\n#   Init TTS with the target model name\ntts = TTS(model_name=\"tts_models/de/thorsten/tacotron2-DDC\", progress_bar=False, gpu=False)\n\n#   Run TTS (file_path must be a file name, not a directory)\ntts.tts_to_file(text=\"Ich bin eine Testnachricht.\", file_path=\"output.wav\")\n\n#   Example voice cloning with YourTTS in English, French and Portuguese:\ntts = TTS(model_name=\"tts_models/multilingual/multi-dataset/your_tts\", progress_bar=False, gpu=True)\ntts.tts_to_file(text=TEXT, speaker_wav=os.path.join(os.getcwd(),'voices','JARVIS','PaulBettanyLongMP3.mp3'), language=\"en\", file_path=\"output.wav\")\n"
  },
  {
    "path": "env.txt",
    "content": "OPENAI_API_KEY = ''\r\nIBM_API_KEY = ''\r\nIBM_TTS_SERVICE = ''\r\nPORCUPINE_KEY = ''\r\nELEVENLABS_API_KEY = ''\r\nHUGGINGFACEHUB_API_TOKEN = ''\r\nSERPER_API_KEY = ''\r\nNEWS_API_KEY =''\r\nGOOGLE_API_KEY = ''\r\nGOOGLE_CSE_ID = ''\r\nOPENWHEATHERMAP_API_KEY = ''\r\nS2_API_KEY = ''\r\n"
  },
  {
    "path": "openai_api_chatbot.py",
    "content": "print('### LOADING CREDENTIALS ###')\r\nfrom dotenv import load_dotenv\r\nimport os\r\n\r\nfrom Assistant.research_mode import ResearchAssistant\r\n\r\nload_dotenv()\r\n\r\n# os.getenv(..., '') avoids a KeyError when a variable is missing from .env entirely\r\nif len(os.getenv('OPENAI_API_KEY', ''))==0: \r\n    print('openai API key not detected in .env')\r\n    raise Exception(\"[$] openai API key is required. Learn more at https://platform.openai.com/account/api-keys\")\r\n\r\nif len(os.getenv('IBM_API_KEY', ''))==0: print('[free] IBM cloud API Key not detected in .env\\nLearn more at: https://cloud.ibm.com/catalog/services/text-to-speech')\r\n\r\nif len(os.getenv('IBM_TTS_SERVICE', ''))==0: print('[free] IBM cloud TTS service not detected in .env\\nLearn more at: https://cloud.ibm.com/catalog/services/text-to-speech')\r\n\r\nuse_porcupine = True\r\nif len(os.getenv('PORCUPINE_KEY', '')) == 0: \r\n    print('[free] PicoVoice not detected in .env\\nLearn more at: https://picovoice.ai/platform/porcupine/')\r\n    use_porcupine = False\r\n\r\n\r\nprint('DONE\\n')\r\n\r\nprint('### IMPORTING DEPENDENCIES ###')\r\nimport pygame\r\n\r\nfrom Assistant import get_audio as myaudio\r\nfrom Assistant.VirtualAssistant import VirtualAssistant\r\nfrom Assistant.tools import count_tokens\r\n\r\nprint('DONE\\n')\r\n\r\n### MAIN\r\nif __name__==\"__main__\":\r\n    print(\"### SETTING UP ENVIRONMENT ###\")\r\n    OFFLINE = False\r\n    pygame.mixer.init()\r\n\r\n    # INITIATE JARVIS\r\n    print('initiating JARVIS voice...')\r\n    jarvis = VirtualAssistant(\r\n        openai_api   = os.getenv('OPENAI_API_KEY'),\r\n        ibm_api      = os.getenv('IBM_API_KEY'),\r\n        ibm_url      = os.getenv('IBM_TTS_SERVICE'),\r\n        elevenlabs_api = os.getenv('ELEVENLABS_API_KEY'),\r\n        elevenlabs_voice = 'Antoni',\r\n        voice_id     = {'en':'jarvis_en'},\r\n        whisper_size = 'medium',\r\n        awake_with_keywords=[\"jarvis\"],\r\n        model= \"gpt-3.5-turbo\",\r\n        embed_model= \"text-embedding-ada-002\",\r\n        RESPONSE_TIME = 3,\r\n 
       SLEEP_DELAY = 30,\r\n        mode = 'CHAT'\r\n        )\r\n\r\n    while True:\r\n        if not(jarvis.is_awake):\r\n            print('\\n awaiting for triggering words...')\r\n\r\n            #block until the wakeword is heard, using porcupine\r\n            if use_porcupine:\r\n                jarvis.block_until_wakeword()\r\n            else:\r\n                while not(jarvis.is_awake):\r\n                    jarvis.listen_passively()\r\n        \r\n        jarvis.record_to_file('output.wav')\r\n        \r\n\r\n        if jarvis.is_awake:\r\n            prompt, detected_language = myaudio.whisper_wav_to_text('output.wav', jarvis.interpreter, prior=jarvis.languages.keys())\r\n\r\n            # check exit command\r\n            if \"THANKS\" in prompt.upper() or len(prompt.split())<=1:\r\n                jarvis.go_to_sleep()\r\n                continue\r\n            \r\n            if detected_language=='en':\r\n                VoiceIdx = 'jarvis'\r\n            else:\r\n                VoiceIdx = detected_language\r\n            \r\n            jarvis.expand_conversation(role=\"user\", content=prompt)\r\n\r\n            # PROMPT MANAGING [BETA]\r\n            #flag = jarvis.analyze_prompt(prompt)\r\n            flag = '-1'\r\n\r\n            # redirect the conversation to an action manager or to the LLM\r\n            if ((\"1\" in flag or \"tool\" in flag) and '-' not in flag):\r\n                print('(though): action')\r\n                response = jarvis.use_tools(prompt)\r\n                response = response\r\n            \r\n            elif \"2\" in flag or \"respond\" in flag:\r\n                print('(though): response')\r\n                response = jarvis.get_answer(prompt)\r\n            elif \"-1\" in flag:\r\n                response = jarvis.switch_mode()\r\n            else:\r\n                print('(though): internet')\r\n                response = jarvis.secondary_agent(prompt)\r\n\r\n            
jarvis.expand_conversation(role='assistant', content=response)\r\n            pygame.mixer.stop()\r\n            jarvis.say(response, VoiceIdx=VoiceIdx, IBM=True)\r\n\r\n            print('\\n')"
  },
  {
    "path": "run.bat",
    "content": "call .\\jarvis_venv\\Scripts\\activate\npython openai_api_chatbot.py\npause\n"
  },
  {
    "path": "saved_chats/2023-03-13_ExploringtheConceptofDeepLearningandItsApplicationsintheAviationIndustry.txt",
    "content": "system: You are a helpful assistant.\nuser: What is deep learning?\nassistant: \n\nDeep learning is a subfield of machine learning that involves training artificial neural networks with a large amount of data to solve complex problems. These neural networks are composed of layers of interconnected nodes that process information and learn to make predictions or decisions by adjusting the weights of the connections between the nodes. Deep learning has achieved notable success in areas such as image recognition, natural language processing, and speech recognition, and has been used to create applications such as self-driving cars, intelligent personal assistants, and facial recognition systems.\nuser: , can i use deep learning to fly a plane?\nassistant: Deep learning has been used in the aviation industry for various purposes, such as predicting the weather and optimizing fuel consumption. However, it is not currently feasible to use deep learning to fly a plane autonomously without human intervention. The aviation industry heavily relies on safety protocols and regulations, and the risk of using AI (including deep learning) as the sole decision-maker for a complex task such as flying a plane is likely to be perceived as too great at this point.\n"
  },
  {
    "path": "saved_chats/2023-03-25_ExploringtheAmbitiousWorldofStarCitizenIsitRightforCasualGamers.txt",
    "content": "system: You are a helpful assistant.\nuser: What can you tell me about the video game Star Citizen?\nassistant: Star Citizen is an ambitious multiplayer space trading and combat game developed by Cloud Imperium Games. It was announced in 2011 and has been in development ever since. The game is built using Amazon Lumberyard engine and features an expansive, persistent universe where players can engage in various activities like exploration, mining, trade, combat, and more.\n\nOne of the game's unique features is its ability to seamlessly switch between first-person and third-person views, as well as the ability to explore and interact with planets, ships, and stations.\n\nThe game has generated a lot of excitement among fans of space sim games, and it has raised over $350 million through crowdfunding, making it one of the most successful crowdfunding campaigns for a video game. However, the game has also faced several delays and controversies, with some critics questioning its development timeline and management. Despite this, the game continues to have a passionate following and is still in development.\nuser: Would you suggest it to a person who is loving about open-world games but doesn't want to spend a lot of time in a game?\nassistant: As of now, Star Citizen is a very ambitious game that is still in development and is not yet considered a fully released game. While it promises to have an open-world gameplay experience, it may not be ideal for a person who isn't interested in investing a lot of time in the game due to its complex gameplay mechanics and features. \n\nAdditionally, since the game is still in development, it may not be fully optimized and may have bugs or glitches that could make it frustrating to play. 
\n\nIf the person is not particularly interested in space sim games or doesn't want to spend a lot of time in a game, there are several other open-world games that they might enjoy such as The Elder Scrolls V: Skyrim, Grand Theft Auto V, Red Dead Redemption 2, or Fallout 4 which offer a more traditional open-world gameplay experience.\n"
  },
  {
    "path": "saved_chats/2023-03-26_AnIntroductiontoArtificialIntelligence.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: , what can you tell me about artificial intelligence?\nassistant: Artificial intelligence (AI) is a branch of computer science that aims to create intelligent machines. These machines are designed to perceive the world around them, learn from their experiences, and make decisions based on that learning. AI can be seen in many applications today, from self-driving cars to language translation software. Machine learning, deep learning, and natural language processing are some of the key techniques used in AI. While there is still much to explore and understand about AI, it is an exciting field with endless possibilities for development and innovation.\n"
  },
  {
    "path": "saved_chats/2023-03-26_ArtificialIntelligence.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: , what can you tell me about artificial intelligence?\nassistant: Artificial intelligence (AI) is a branch of computer science that aims to create intelligent machines. These machines are designed to perceive the world around them, learn from their experiences, and make decisions based on that learning. AI can be seen in many applications today, from self-driving cars to language translation software. Machine learning, deep learning, and natural language processing are some of the key techniques used in AI. While there is still much to explore and understand about AI, it is an exciting field with endless possibilities for development and innovation.\n"
  },
  {
    "path": "saved_chats/2023-03-26_ExploringthePossibilityofHumanExtinction.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What are the chances of humankind to be dead?\nassistant: It is difficult to predict the exact chances of humankind becoming extinct. However, there are several factors that could potentially threaten the survival of our species. These include natural disasters such as severe weather events, pandemics, and the impact of climate change. Additionally, human activities such as nuclear war, environmental degradation, and the development of dangerous technologies could also contribute to the risk of extinction. While the likelihood of any one of these scenarios occurring is uncertain, it is important to take steps to mitigate these risks and work towards a sustainable future for humanity.\nuser: Thank you for watching!\nassistant: You're welcome! If you have any more questions, feel free to ask.\n"
  },
  {
    "path": "saved_chats/2023-03-26_ExploringtheUseofTransformersinImageSegmentation.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What are transformers architecture and how can they be used in segmentation?\nassistant: The transformer architecture is a deep learning model that has been widely used in natural language processing applications due to its impressive performance in language translation and text generation tasks. The model is based on an attention mechanism that allows it to selectively focus on different parts of the input sequence, enabling it to capture long-range dependencies between words.\n\nIn recent years, researchers have started exploring the applicability of transformer models in computer vision tasks such as image segmentation. One of the advantages of using transformers for segmentation is their ability to process images as sequences of tokens. This means that the model can take into account the spatial relationship between pixels in an image, allowing it to produce more contextually aware segmentations.\n\nThere are different ways to incorporate transformers into segmentation models. One approach is to use a pre-trained transformer model, such as the Vision Transformer (ViT), to encode the input image into a sequence of feature vectors. These feature vectors are then fed into a segmentation head that produces pixel-wise predictions for each class.\n\nAnother approach is to use transformers in a fully self-supervised setting, where the model is trained to predict the color or texture of a patch given its surrounding context. This pre-training step allows the model to learn powerful image representations that can be used for downstream tasks like segmentation.\n\nIn summary, transformers have shown great potential in segmentation tasks by allowing models to capture complex spatial relationships in image data. 
However, like any deep learning model, they require significant computational resources and training data to achieve state-of-the-art performance.\n"
  },
  {
    "path": "saved_chats/2023-03-26_FlutterDevelopment.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: Quali sono i principali metodi per imparare a programmare in flapper? Quali sono le cose più difficili e le cose più facili?\nassistant: Per imparare a programmare in Flutter, ci sono diversi metodi tra cui:\n\n1. Leggere la documentazione ufficiale di Flutter per avere una conoscenza generale del framework e delle sue funzionalità.\n\n2. Seguire tutorial su YouTube, Medium e siti web specializzati come Flutter.dev per imparare passo dopo passo.\n\n3. Partecipare alla community di Flutter su Slack e Reddit per porre domande e ricevere aiuto dagli sviluppatori più esperti.\n\nLe cose più semplici su Flutter sono la creazione di un'interfaccia utente accattivante e la possibilità di testare l'app in tempo reale su più piattaforme.\n\nD'altra parte, le cose più difficili possono essere la comprensione della sintassi Dart, la gestione dello stato dell'applicazione e la scelta di un'architettura adeguata per l'app. Tuttavia, con la pratica e l'aiuto delle risorse della comunità, queste sfide possono essere superate.\nuser: Che cos'è lo stato dell'applicazione?\nassistant: Lo stato dell'applicazione si riferisce alle informazioni o ai dati che sono attualmente memorizzati nella memoria dell'applicazione in un determinato momento. In Flutter, lo stato dell'applicazione può riferirsi a qualsiasi variabile o dati all'interno dell'applicazione che possono cambiare durante il ciclo di vita dell'app. \n\nAd esempio, lo stato dell'applicazione potrebbe includere la posizione dell'utente, il testo inserito in un campo di input, l'opzione selezionata in un menu a discesa o qualsiasi altra informazione che è necessario salvare per il funzionamento dell'app. 
\n\nLa gestione dello stato nell'applicazione può essere uno degli aspetti più complessi di sviluppare in Flutter, poiché richiede l'organizzazione delle informazioni in modo appropriato per garantire che l'app funzioni correttamente in tutte le situazioni. Flutter offre alcuni strumenti potenti per la gestione dello stato dell'applicazione, come il widget Stateful, il provider package e le librerie di gestione dello stato come Redux.\n"
  },
  {
    "path": "saved_chats/2023-03-26_FlutterDevelopmentLearningMethodsandManagingAppState.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What are the main ways to learn to program in flapper? What are the hardest things and the easiest things?\nassistant: There are several ways to learn to program in Flutter, including:\n\n1. Read the official Flutter documentation to get a general understanding of the framework and its features.\n\n2. Follow tutorials on YouTube, Medium, and specialized websites such as Flutter.dev to learn step by step.\n\n3. Join the Flutter community on Slack and Reddit to ask questions and get help from more experienced developers.\n\nThe easiest things in Flutter are building an attractive user interface and being able to test the app in real time on multiple platforms.\n\nOn the other hand, the hardest things can be understanding Dart syntax, managing application state, and choosing a suitable architecture for the app. However, with practice and the help of community resources, these challenges can be overcome.\nuser: What is application state?\nassistant: Application state refers to the information or data currently held in the application's memory at a given moment. In Flutter, application state can refer to any variable or data within the application that can change during the app's lifecycle. \n\nFor example, application state might include the user's location, the text entered in an input field, the option selected in a drop-down menu, or any other information that needs to be saved for the app to work. 
\n\nState management can be one of the most complex aspects of developing in Flutter, because it requires organizing information appropriately so that the app works correctly in every situation. Flutter offers some powerful tools for managing application state, such as the StatefulWidget class, the provider package, and state management libraries such as Redux.\n"
  },
  {
    "path": "saved_chats/2023-03-26_HumanExtinction.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What are the chances of humankind to be dead?\nassistant: It is difficult to predict the exact chances of humankind becoming extinct. However, there are several factors that could potentially threaten the survival of our species. These include natural disasters such as severe weather events, pandemics, and the impact of climate change. Additionally, human activities such as nuclear war, environmental degradation, and the development of dangerous technologies could also contribute to the risk of extinction. While the likelihood of any one of these scenarios occurring is uncertain, it is important to take steps to mitigate these risks and work towards a sustainable future for humanity.\nuser: Thank you for watching!\nassistant: You're welcome! If you have any more questions, feel free to ask.\n"
  },
  {
    "path": "saved_chats/2023-03-26_NuclearandElectricPropulsioninSpaceExplorationAdvantagesandChallenges.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: Is there a way I can know how a rocket works?\nassistant: Yes, there is a way to understand how a rocket works. A rocket works by utilizing the principle of action-reaction, also known as Newton's third law of motion. This means that for every action, there is an equal and opposite reaction. A rocket engine generates a massive amount of energy by burning fuel and oxidizer, which creates hot gases that are expelled out of the bottom of the rocket. This exhaust generates a force that propels the rocket in the opposite direction. The rocket's shape and aerodynamics are designed to reduce air resistance and increase the speed and altitude it can achieve. To better understand the specifics of how a rocket works, you can research propulsion systems, rocket design, and the history of space exploration.\nuser: What can you tell me about electric propulsion in rockets?\nassistant: Electric propulsion is a type of rocket propulsion that uses electrical energy to accelerate propellant and generate thrust. It is often used in spacecraft propulsion systems for missions that require precise control and long-duration high-speed travel. \n\nThere are different types of electric propulsion systems, but they all work on the same principle: accelerating ions or electrons using an electric field. This is in contrast to traditional chemical rockets, which generate thrust by combusting fuel.\n\nOne type of electric propulsion is called electrically-powered ion propulsion, which uses ions that are generated by an electrically charged grid to propel the spacecraft. This system uses a low amount of fuel but produces a weak thrust, making it ideal for long-duration missions where high velocities are required.\n\nAnother type of electric propulsion is the Hall thruster, which uses a magnetic field to trap high-energy electrons and ionize the propellant. 
This produces a larger thrust than ion propulsion, but it still uses less fuel than traditional chemical rockets.\n\nOverall, electric propulsion is a promising technology for space exploration as it offers a more efficient and cost-effective way to travel in space.\nuser: without the margin of improvement of electric propulsion in rocket flight.\nassistant: Even without considering future advancements in electric propulsion, there are several advantages and disadvantages to using electric propulsion in rocket flight.\n\nOne advantage of electric propulsion is that it requires much less fuel than traditional chemical rockets. This is because electric propulsion is more efficient at converting the stored energy into thrust. This means that spacecraft can carry smaller amounts of fuel, which reduces weight and increases payload capacity. Additionally, electric propulsion can provide a lower level of thrust for a much longer duration, which is ideal for missions that require gradual and precise maneuvering.\n\nHowever, one disadvantage of electric propulsion is that it generates less thrust than chemical rockets. This means that spacecraft must carry out longer-duration burns to achieve the desired velocity, which can result in longer travel times. Additionally, electric propulsion systems are often more complex and less reliable than chemical systems, and they require very high levels of electrical power, which can be difficult to generate and manage in space.\n\nOverall, while electric propulsion offers several advantages, it also has some limitations that need to be carefully considered when designing and planning space missions.\nuser: Do you think that nuclear energy could play a role in rocket propulsion?\nassistant: Yes, nuclear energy could potentially play a role in rocket propulsion. Nuclear propulsion systems typically work by heating a propellant, such as hydrogen or water, using a nuclear reactor. 
The heat from the reaction is then used to accelerate the propellant and generate thrust. \n\nThere are several potential advantages to using nuclear propulsion. One is that nuclear systems could provide much greater levels of thrust compared to traditional chemical rockets or even electric propulsion systems, which could enable faster travel to far-off destinations. Additionally, nuclear systems could use much less propellant, which would reduce the overall mass and cost of the spacecraft.\n\nThere are, however, also several challenges and concerns associated with nuclear propulsion. One major challenge is that it can be difficult to contain and safely handle nuclear fuel in space, where even small accidents or malfunctions could have catastrophic consequences. Additionally, there are concerns about the environmental impact of nuclear propulsion, both in terms of radiation exposure and in terms of potential contamination if there were an accident or problem during launch or operation.\n\nOverall, while nuclear propulsion holds promise for the future of space exploration, it is a complex and potentially risky technology that requires careful consideration of the benefits and drawbacks.\nuser: Could you explain better how can I use a nuclear reactor to generate thrust?\nassistant: Nuclear propulsion uses a nuclear reactor to heat a propellant and create thrust. There are two main types of nuclear propulsion systems that have been studied: nuclear thermal propulsion (NTP) and nuclear electric propulsion (NEP).\n\nIn a nuclear thermal propulsion system, nuclear fuel is used to heat hydrogen or another propellant. The hot gas is then expelled out of a nozzle to generate thrust. The nuclear fuel is typically enriched uranium or plutonium, and it is contained within a reactor that is designed to minimize the risk of a radiation leak. 
The heat from the reactor is transferred to the propellant via a heat exchanger, which can be made of a high-melting-point metal such as tungsten. \n\nIn a nuclear electric propulsion system, a nuclear reactor produces electricity that is used to ionize propellant and create thrust. The reactor heats a working fluid, which then drives a turbine to produce electricity. That electricity is then used to ionize an inert gas, such as xenon, which is expelled from the spacecraft to generate thrust.\n\nNuclear thermal propulsion can potentially offer roughly twice the specific impulse (thrust per unit weight flow rate of propellant, a measure of propellant efficiency) of chemical rockets, while nuclear electric propulsion pairs the very high specific impulse of electric thrusters with a power source that does not depend on sunlight. This would enable faster and more efficient missions to destinations like Mars, and could reduce overall mission cost by requiring less propellant. However, these systems present significant technical challenges in terms of containment and radiation shielding, and the potential environmental impacts of a reactor malfunction or accident must be carefully assessed.\n"
  },
  {
    "path": "saved_chats/2023-03-26_RevolutionizingHealthcare.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What is Artificial Intelligence?\nassistant: Artificial Intelligence (AI) is a field of computer science that focuses on the development of intelligent machines that can perform tasks that typically require human-like cognitive abilities, such as learning, reasoning, perception, and decision-making. AI systems are designed to analyze and interpret complex data sets, recognize patterns and make predictions, and automate routine tasks. They can also interact with humans using natural language processing and speech recognition technologies. AI has the potential to revolutionize many industries and improve people's lives in numerous ways, from healthcare and finance to transportation and entertainment.\nuser: What can you tell me about artificial intelligence in the medical field?\nassistant: Artificial Intelligence (AI) has the potential to revolutionize the medical field by improving patient outcomes, increasing efficiency, and reducing costs. AI can be used in various medical applications such as disease diagnosis, personalized treatment plans, and drug discovery. \n\nOne of the most significant benefits of AI in the medical field is its ability to analyze and interpret large amounts of medical data quickly and accurately. Machine learning algorithms can be trained on this data to recognize patterns and identify potential health risks or diagnoses with high accuracy. This can help doctors make more informed decisions about treatment plans and improve patient outcomes.\n\nAI can also aid in drug discovery by analyzing complex molecular interactions and predicting the efficacy and safety of new drugs. This reduces the time and cost associated with traditional drug development processes. 
Moreover, AI-powered tools can automate routine administrative tasks such as scheduling and billing, which can save time and resources.\n\nHowever, there are still some challenges to the wide-scale adoption of AI in the medical field, including privacy concerns, technical limitations, and regulatory requirements. Nonetheless, AI has enormous potential to transform healthcare and improve patient outcomes.\n"
  },
  {
    "path": "saved_chats/2023-03-26_RevolutionizingHealthcarewithArtificialIntelligenceBenefitsandOpportunities.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What is Artificial Intelligence?\nassistant: Artificial Intelligence (AI) is a field of computer science that focuses on the development of intelligent machines that can perform tasks that typically require human-like cognitive abilities, such as learning, reasoning, perception, and decision-making. AI systems are designed to analyze and interpret complex data sets, recognize patterns and make predictions, and automate routine tasks. They can also interact with humans using natural language processing and speech recognition technologies. AI has the potential to revolutionize many industries and improve people's lives in numerous ways, from healthcare and finance to transportation and entertainment.\nuser: What can you tell me about artificial intelligence in the medical field?\nassistant: Artificial Intelligence (AI) has the potential to revolutionize the medical field by improving patient outcomes, increasing efficiency, and reducing costs. AI can be used in various medical applications such as disease diagnosis, personalized treatment plans, and drug discovery. \n\nOne of the most significant benefits of AI in the medical field is its ability to analyze and interpret large amounts of medical data quickly and accurately. Machine learning algorithms can be trained on this data to recognize patterns and identify potential health risks or diagnoses with high accuracy. This can help doctors make more informed decisions about treatment plans and improve patient outcomes.\n\nAI can also aid in drug discovery by analyzing complex molecular interactions and predicting the efficacy and safety of new drugs. This reduces the time and cost associated with traditional drug development processes. 
Moreover, AI-powered tools can automate routine administrative tasks such as scheduling and billing, which can save time and resources.\n\nHowever, there are still some challenges to the wide-scale adoption of AI in the medical field, including privacy concerns, technical limitations, and regulatory requirements. Nonetheless, AI has enormous potential to transform healthcare and improve patient outcomes.\n"
  },
  {
    "path": "saved_chats/2023-03-26_SpacePropulsion.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: Is there a way I can know how a rocket works?\nassistant: Yes, there is a way to understand how a rocket works. A rocket works by utilizing the principle of action-reaction, also known as Newton's third law of motion. This means that for every action, there is an equal and opposite reaction. A rocket engine generates a massive amount of energy by burning fuel and oxidizer, which creates hot gases that are expelled out of the bottom of the rocket. This exhaust generates a force that propels the rocket in the opposite direction. The rocket's shape and aerodynamics are designed to reduce air resistance and increase the speed and altitude it can achieve. To better understand the specifics of how a rocket works, you can research propulsion systems, rocket design, and the history of space exploration.\nuser: What can you tell me about electric propulsion in rockets?\nassistant: Electric propulsion is a type of rocket propulsion that uses electrical energy to accelerate propellant and generate thrust. It is often used in spacecraft propulsion systems for missions that require precise control and long-duration high-speed travel. \n\nThere are different types of electric propulsion systems, but they all work on the same principle: accelerating ions or electrons using an electric field. This is in contrast to traditional chemical rockets, which generate thrust by combusting fuel.\n\nOne type of electric propulsion is called electrically-powered ion propulsion, which uses ions that are generated by an electrically charged grid to propel the spacecraft. This system uses a low amount of fuel but produces a weak thrust, making it ideal for long-duration missions where high velocities are required.\n\nAnother type of electric propulsion is the Hall thruster, which uses a magnetic field to trap high-energy electrons and ionize the propellant. 
This produces a larger thrust than ion propulsion, but it still uses less fuel than traditional chemical rockets.\n\nOverall, electric propulsion is a promising technology for space exploration as it offers a more efficient and cost-effective way to travel in space.\nuser: without the margin of improvement of electric propulsion in rocket flight.\nassistant: Even without considering future advancements in electric propulsion, there are several advantages and disadvantages to using electric propulsion in rocket flight.\n\nOne advantage of electric propulsion is that it requires much less fuel than traditional chemical rockets. This is because electric propulsion is more efficient at converting the stored energy into thrust. This means that spacecraft can carry smaller amounts of fuel, which reduces weight and increases payload capacity. Additionally, electric propulsion can provide a lower level of thrust for a much longer duration, which is ideal for missions that require gradual and precise maneuvering.\n\nHowever, one disadvantage of electric propulsion is that it generates less thrust than chemical rockets. This means that spacecraft must carry out longer-duration burns to achieve the desired velocity, which can result in longer travel times. Additionally, electric propulsion systems are often more complex and less reliable than chemical systems, and they require very high levels of electrical power, which can be difficult to generate and manage in space.\n\nOverall, while electric propulsion offers several advantages, it also has some limitations that need to be carefully considered when designing and planning space missions.\nuser: Do you think that nuclear energy could play a role in rocket propulsion?\nassistant: Yes, nuclear energy could potentially play a role in rocket propulsion. Nuclear propulsion systems typically work by heating a propellant, such as hydrogen or water, using a nuclear reactor. 
The heat from the reaction is then used to accelerate the propellant and generate thrust. \n\nThere are several potential advantages to using nuclear propulsion. One is that nuclear systems could provide much greater levels of thrust compared to traditional chemical rockets or even electric propulsion systems, which could enable faster travel to far-off destinations. Additionally, nuclear systems could use much less propellant, which would reduce the overall mass and cost of the spacecraft.\n\nThere are, however, also several challenges and concerns associated with nuclear propulsion. One major challenge is that it can be difficult to contain and safely handle nuclear fuel in space, where even small accidents or malfunctions could have catastrophic consequences. Additionally, there are concerns about the environmental impact of nuclear propulsion, both in terms of radiation exposure and in terms of potential contamination if there were an accident or problem during launch or operation.\n\nOverall, while nuclear propulsion holds promise for the future of space exploration, it is a complex and potentially risky technology that requires careful consideration of the benefits and drawbacks.\nuser: Could you explain better how can I use a nuclear reactor to generate thrust?\nassistant: Nuclear propulsion uses a nuclear reactor to heat a propellant and create thrust. There are two main types of nuclear propulsion systems that have been studied: nuclear thermal propulsion (NTP) and nuclear electric propulsion (NEP).\n\nIn a nuclear thermal propulsion system, nuclear fuel is used to heat hydrogen or another propellant. The hot gas is then expelled out of a nozzle to generate thrust. The nuclear fuel is typically enriched uranium or plutonium, and it is contained within a reactor that is designed to minimize the risk of a radiation leak. 
The heat from the reactor is transferred to the propellant via a heat exchanger, which can be made of a high-melting-point metal such as tungsten. \n\nIn a nuclear electric propulsion system, a nuclear reactor produces electricity that is used to ionize propellant and create thrust. The reactor heats a working fluid, which then drives a turbine to produce electricity. That electricity is then used to ionize an inert gas, such as xenon, which is expelled from the spacecraft to generate thrust.\n\nNuclear thermal propulsion can potentially offer roughly twice the specific impulse (thrust per unit weight flow rate of propellant, a measure of propellant efficiency) of chemical rockets, while nuclear electric propulsion pairs the very high specific impulse of electric thrusters with a power source that does not depend on sunlight. This would enable faster and more efficient missions to destinations like Mars, and could reduce overall mission cost by requiring less propellant. However, these systems present significant technical challenges in terms of containment and radiation shielding, and the potential environmental impacts of a reactor malfunction or accident must be carefully assessed.\n"
  },
  {
    "path": "saved_chats/2023-03-26_TagImageSegmentation.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What are transformers architecture and how can they be used in segmentation?\nassistant: The transformer architecture is a deep learning model that has been widely used in natural language processing applications due to its impressive performance in language translation and text generation tasks. The model is based on an attention mechanism that allows it to selectively focus on different parts of the input sequence, enabling it to capture long-range dependencies between words.\n\nIn recent years, researchers have started exploring the applicability of transformer models in computer vision tasks such as image segmentation. One of the advantages of using transformers for segmentation is their ability to process images as sequences of tokens. This means that the model can take into account the spatial relationship between pixels in an image, allowing it to produce more contextually aware segmentations.\n\nThere are different ways to incorporate transformers into segmentation models. One approach is to use a pre-trained transformer model, such as the Vision Transformer (ViT), to encode the input image into a sequence of feature vectors. These feature vectors are then fed into a segmentation head that produces pixel-wise predictions for each class.\n\nAnother approach is to use transformers in a fully self-supervised setting, where the model is trained to predict the color or texture of a patch given its surrounding context. This pre-training step allows the model to learn powerful image representations that can be used for downstream tasks like segmentation.\n\nIn summary, transformers have shown great potential in segmentation tasks by allowing models to capture complex spatial relationships in image data. 
However, like any deep learning model, they require significant computational resources and training data to achieve state-of-the-art performance.\n"
  },
  {
    "path": "saved_chats/2023-03-26_TipsforRunningScriptsatSystemStartuponWindowsandLinux.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: How can I run some command line scripts from System Startup?\nassistant: To run command line scripts from System Startup, you can add the desired commands or scripts to the system's startup files or create a new script to run them.\n\nFor Windows systems, you can add the commands to the \"Startup\" folder or use the Registry Editor to create entries in the \"Run\" key. You can also use the Task Scheduler to schedule the scripts to run when the system starts up.\n\nFor Linux systems, you can create a script in the /etc/init.d/ directory and use the update-rc.d command to add it to the startup sequence. You can also create a Systemd service or add the commands to the rc.local file to run them during system startup.\n\nIt is important to ensure that the scripts or commands are necessary and won't cause any issues or conflicts with other processes running on the system during startup. You can also test the scripts in a safe environment before implementing them in a production environment.\nuser: How can I run a vital script, a startup?\nassistant: To run a vital script at system startup, you need to ensure that the script is added to the system's startup files or services. \n\nOn Windows systems, you can add the script to the registry under the \"HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run\" key to automatically run it at startup. You can also create a Windows service using the \"sc create\" command in Command Prompt or the \"New-Service\" cmdlet in PowerShell.\n\nOn Linux systems, you can add the script to the init.d directory and use the update-rc.d command to make it run automatically at startup. 
Alternatively, you can create a Systemd service using the \"systemctl\" command or add the script to the rc.local file to run it during the boot process.\n\nIt is important to ensure that the vital script is tested and functioning properly before adding it to the startup files, as any errors or conflicts during startup can cause problems for the rest of the system. You can also set up monitoring and logging for the script to ensure that it runs successfully and has no unexpected behaviors.\nuser: How can I run a Python script from a virtual environment at startup?\nassistant: To run a Python script from a virtual environment at startup, you can create a shell script that activates the virtual environment and then runs the Python script.\n\nHere are the steps to create a shell script for running a Python script from a virtual environment at startup on Linux:\n\n1. Open a terminal and navigate to the directory where you want to create the shell script.\n2. Create a new file and name it something like myscript.sh: `nano myscript.sh`\n3. Add the following code to the file, replacing \"myvenv\" with the name of your virtual environment and \"myscript.py\" with the name of your Python script:\n\n```\n#!/bin/bash\ncd /path/to/my/project\nsource /path/to/myvenv/bin/activate\npython /path/to/myscript.py\n```\n\n4. Make the script executable: `chmod +x myscript.sh`\n5. Test the script by running it: `./myscript.sh`\n\nOnce the script is working correctly, you can add it to the system's startup files so it will run automatically at startup. On Linux, you can add the script to the rc.local file or create a Systemd service that runs the script.\nuser: How can I do it on Windows OS?\nassistant: On Windows, you can use a batch script to run a Python script from a virtual environment at startup. Here are the steps to create a batch script for this purpose:\n\n1. Open Notepad or any other text editor.\n2. 
Type the following code, replacing \"C:\\path\\to\\myenv\" with the path to your virtual environment, and \"C:\\path\\to\\myscript.py\" with the path to your Python script:\n\n```\n@echo off\ncd /d C:\\path\\to\\myproject\ncall C:\\path\\to\\myenv\\Scripts\\activate.bat\npython C:\\path\\to\\myscript.py\n```\n\n3. Save the file with a .bat extension, such as \"myscript.bat\".\n4. Place the .bat file in the Startup folder: Press Win+R to open the \"Run\" dialog box, type \"shell:startup\" and press Enter. This will open the \"Startup\" folder. Place the .bat file here.\n5. Restart your computer and log in; the script should then run automatically from the virtual environment.\n\nMake sure to replace the paths in the batch script with the actual paths to your project, virtual environment, and Python script. The /d switch lets cd change the current drive as well as the directory, which matters if the project is not on the drive the script starts from. Also, double-check that the script is working correctly by testing it before adding it to the Startup folder.\n"
  },
  {
    "path": "saved_chats/2023-03-26_TipsforRunningStartup.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: How can I run some command line scripts from System Startup?\nassistant: To run command line scripts from System Startup, you can add the desired commands or scripts to the system's startup files or create a new script to run them.\n\nFor Windows systems, you can add the commands to the \"Startup\" folder or use the Registry Editor to create entries in the \"Run\" key. You can also use the Task Scheduler to schedule the scripts to run when the system starts up.\n\nFor Linux systems, you can create a script in the /etc/init.d/ directory and use the update-rc.d command to add it to the startup sequence. You can also create a Systemd service or add the commands to the rc.local file to run them during system startup.\n\nIt is important to ensure that the scripts or commands are necessary and won't cause any issues or conflicts with other processes running on the system during startup. You can also test the scripts in a safe environment before implementing them in a production environment.\nuser: How can I run a vital script, a startup?\nassistant: To run a vital script at system startup, you need to ensure that the script is added to the system's startup files or services. \n\nOn Windows systems, you can add the script to the registry under the \"HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run\" key to automatically run it at startup. You can also create a Windows service using the \"sc create\" command in Command Prompt or the \"New-Service\" cmdlet in PowerShell.\n\nOn Linux systems, you can add the script to the init.d directory and use the update-rc.d command to make it run automatically at startup. 
Alternatively, you can create a systemd service (a unit file that you enable with the \"systemctl\" command) or add the script to the rc.local file to run it during the boot process.\n\nIt is important to ensure that the vital script is tested and functioning properly before adding it to the startup files, as any errors or conflicts during startup can cause problems for the rest of the system. You can also set up monitoring and logging for the script to confirm that it runs successfully and shows no unexpected behavior.\nuser: How can I run a Python script from a virtual environment at startup?\nassistant: To run a Python script from a virtual environment at startup, you can create a shell script that activates the virtual environment and then runs the Python script.\n\nHere are the steps to create a shell script for running a Python script from a virtual environment at startup on Linux:\n\n1. Open a terminal and navigate to the directory where you want to create the shell script.\n2. Create a new file and name it something like myscript.sh: `nano myscript.sh`\n3. Add the following code to the file, replacing \"myvenv\" with the name of your virtual environment and \"myscript.py\" with the name of your Python script:\n\n```\n#!/bin/bash\ncd /path/to/my/project\nsource /path/to/myvenv/bin/activate\npython /path/to/myscript.py\n```\n\n4. Make the script executable: `chmod +x myscript.sh`\n5. Test the script by running it: `./myscript.sh`\n\nOnce the script is working correctly, you can add it to the system's startup files so it will run automatically at startup. On Linux, you can add the script to the rc.local file or create a systemd service that runs the script.\nuser: How can I do it on Windows OS?\nassistant: On Windows, you can use a batch script to run a Python script from a virtual environment at startup. Here are the steps to create a batch script for this purpose:\n\n1. Open Notepad or any other text editor.\n2. Type the following code, replacing \"C:\\path\\to\\myenv\" with the path to your virtual environment, and \"C:\\path\\to\\myscript.py\" with the path to your Python script:\n\n```\n@echo off\ncd /d C:\\path\\to\\myproject\ncall C:\\path\\to\\myenv\\Scripts\\activate.bat\npython C:\\path\\to\\myscript.py\n```\n\n3. Save the file with a .bat extension, such as \"myscript.bat\".\n4. Place the .bat file in the Startup folder: Press Win+R to open the \"Run\" dialog box, type \"shell:startup\" and press Enter. This will open the \"Startup\" folder. Place the .bat file here.\n5. Restart your computer; the script should run automatically from the virtual environment when you log in.\n\nMake sure to replace the paths in the batch script with the actual paths to your project, virtual environment, and Python script. Also, double-check that the script is working correctly by testing it before adding it to the Startup folder.\n"
  },
  {
    "path": "saved_chats/2023-03-28_AsteroidImpact.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: How large would an asteroid hitting the Earth need to be to cause significant damage?\nassistant: The size an asteroid needs to be to cause significant damage depends on several factors, such as its composition, its speed, and its angle of impact. However, it is estimated that a minimum diameter of about 1 km is needed to cause global-scale effects that could lead to catastrophic consequences such as the darkening of the sun, falling temperatures, tsunamis, and the destruction of entire urban areas. \n\nEven smaller asteroids, however, can cause significant damage if they strike densely populated areas. For example, in 2013 an asteroid only about 20 metres in diameter disintegrated in the atmosphere above the Russian city of Chelyabinsk, causing significant property damage and injuring hundreds of people. \n\nSo, although the size of the asteroid is an important factor to consider, other factors such as population density and the location of the impact can also affect the damage caused. It is therefore important to constantly monitor asteroid orbits and to develop methods to prevent and mitigate potential impacts.\n"
  },
  {
    "path": "saved_chats/2023-03-28_ChoosingAppleComputer.txt",
    "content": "system: you are a helpful assistant.\nuser: What is the last Apple computer that you know?\nassistant: As an AI language model, I do not have the capability to have up-to-date knowledge, but I can tell you that the latest Apple computer as of early 2021 is the M1 MacBook Pro, MacBook Air, and Mac mini. These models feature Apple's new M1 chip, which promises better performance and energy efficiency compared to previous models. Additionally, there are other Apple computers still available on the market such as the MacBook Pro 16-inch, iMac, and iMac Pro that are still popular among Apple users.\nuser: Okay, let's say next year Apple will launch a new line of computers which have a possible M2 chip with faster processing capabilities. What would you recommend between the M1 and this hypothetical M2 if I aim to do deep learning and machine learning algorithm?\nassistant: If you are planning to do deep learning and machine learning algorithm, it is likely that you will benefit from a computer with faster processing capabilities. If Apple does indeed launch a new line of computers with an M2 chip, it is possible that the new chip will have faster processing capabilities than the M1 chip, which could be beneficial for your purposes. \n\nHowever, the decision of which chip to choose ultimately depends on the specific requirements of your workloads. Deep learning and machine learning algorithms can be very demanding in terms of computational resources, so it is important to choose a system that can handle the specific workload you are working with.\n\nIn general, if the M2 chip provides significant improvements in processing speed, it may be worth considering upgrading, especially if you often work with large datasets or perform computationally intense operations. 
However, it is also worth considering other factors like the cost and availability of the new line, as well as the compatibility of software and drivers with the new architecture.\nuser: Ok, so I am undecided between an M2 with 8GB of unified memory and an M1 with 16GB of unified memory. Which one is the best considering that I don't want to spend much money?\nassistant: If you are on a tight budget, choosing the M1 with 16GB of unified memory could be a better option. The M1 chip is already a high-performance processor, and the 16GB of unified memory provides enough memory capacity for most workloads. \n\nWhile the M2 chip with 8GB of unified memory may offer faster processing capabilities, having less memory could become a limitation when dealing with large datasets, running multiple deep learning or machine learning algorithms at the same time, or performing other memory-intensive tasks. Additionally, the M2 chip with 8GB of memory may be more expensive than the M1 with 16GB of memory.\n\nUltimately, it is important to consider not only the processing power but also memory capacity when choosing a computer for deep learning and machine learning algorithms. If you have demanding workloads, more memory will be beneficial, even if it means sacrificing some processing speed.\nuser: Okay, considering that I am dealing with bigger data sets, what would be the ideal size of the unified memory, which includes the graphics card's memory? Because I would think that 16 gigabytes is enough, but some people suggest that 24 or even higher is necessary to do this kind of work.\nassistant: The ideal size of unified memory for deep learning and machine learning workloads depends on the size of your dataset, the complexity of your models, and the number of training or inference operations you perform. \n\nWhile 16GB of unified memory may be sufficient for many use cases, larger memory capacities can offer benefits in terms of performance and speed. If you work with large datasets and deep neural networks, or if you tend to perform multiple workloads simultaneously, you may benefit from greater memory capacity.\n\nIn general, a minimum of 16GB of unified memory is recommended for deep learning and machine learning workloads. However, if you can afford it, 24GB or even 32GB of unified memory can potentially provide significant performance improvements, especially when working with large datasets that require lots of memory. Note that on Apple silicon the GPU shares the unified memory pool rather than having separate dedicated graphics memory (VRAM), so a larger unified memory also increases the memory available for GPU work.\n\nUltimately, the ideal amount of unified memory for deep learning and machine learning depends on the specific needs of your use case. If you work with large data sets, it may be worth investing in a system with more memory capacity to help increase performance and productivity.\n"
  },
  {
    "path": "saved_chats/2023-03-28_ChoosingtheRightAppleComputerforDeepLearningandMachineLearningM1vsM2ChipandOptimalMemorySize.txt",
    "content": "system: you are a helpful assistant.\nuser: What is the last Apple computer that you know?\nassistant: As an AI language model, I do not have the capability to have up-to-date knowledge, but I can tell you that the latest Apple computer as of early 2021 is the M1 MacBook Pro, MacBook Air, and Mac mini. These models feature Apple's new M1 chip, which promises better performance and energy efficiency compared to previous models. Additionally, there are other Apple computers still available on the market such as the MacBook Pro 16-inch, iMac, and iMac Pro that are still popular among Apple users.\nuser: Okay, let's say next year Apple will launch a new line of computers which have a possible M2 chip with faster processing capabilities. What would you recommend between the M1 and this hypothetical M2 if I aim to do deep learning and machine learning algorithm?\nassistant: If you are planning to do deep learning and machine learning algorithm, it is likely that you will benefit from a computer with faster processing capabilities. If Apple does indeed launch a new line of computers with an M2 chip, it is possible that the new chip will have faster processing capabilities than the M1 chip, which could be beneficial for your purposes. \n\nHowever, the decision of which chip to choose ultimately depends on the specific requirements of your workloads. Deep learning and machine learning algorithms can be very demanding in terms of computational resources, so it is important to choose a system that can handle the specific workload you are working with.\n\nIn general, if the M2 chip provides significant improvements in processing speed, it may be worth considering upgrading, especially if you often work with large datasets or perform computationally intense operations. 
However, it is also worth considering other factors like the cost and availability of the new line, as well as the compatibility of software and drivers with the new architecture.\nuser: Ok, so I am undecided between an M2 with 8GB of unified memory and an M1 with 16GB of unified memory. Which one is the best considering that I don't want to spend much money?\nassistant: If you are on a tight budget, choosing the M1 with 16GB of unified memory could be a better option. The M1 chip is already a high-performance processor, and the 16GB of unified memory provides enough memory capacity for most workloads. \n\nWhile the M2 chip with 8GB of unified memory may offer faster processing capabilities, having less memory could become a limitation when dealing with large datasets, running multiple deep learning or machine learning algorithms at the same time, or performing other memory-intensive tasks. Additionally, the M2 chip with 8GB of memory may be more expensive than the M1 with 16GB of memory.\n\nUltimately, it is important to consider not only the processing power but also memory capacity when choosing a computer for deep learning and machine learning algorithms. If you have demanding workloads, more memory will be beneficial, even if it means sacrificing some processing speed.\nuser: Okay, considering that I am dealing with bigger data sets, what would be the ideal size of the unified memory, which includes the graphics card's memory? Because I would think that 16 gigabytes is enough, but some people suggest that 24 or even higher is necessary to do this kind of work.\nassistant: The ideal size of unified memory for deep learning and machine learning workloads depends on the size of your dataset, the complexity of your models, and the number of training or inference operations you perform. \n\nWhile 16GB of unified memory may be sufficient for many use cases, larger memory capacities can offer benefits in terms of performance and speed. If you work with large datasets and deep neural networks, or if you tend to perform multiple workloads simultaneously, you may benefit from greater memory capacity.\n\nIn general, a minimum of 16GB of unified memory is recommended for deep learning and machine learning workloads. However, if you can afford it, 24GB or even 32GB of unified memory can potentially provide significant performance improvements, especially when working with large datasets that require lots of memory. Note that on Apple silicon the GPU shares the unified memory pool rather than having separate dedicated graphics memory (VRAM), so a larger unified memory also increases the memory available for GPU work.\n\nUltimately, the ideal amount of unified memory for deep learning and machine learning depends on the specific needs of your use case. If you work with large data sets, it may be worth investing in a system with more memory capacity to help increase performance and productivity.\n"
  },
  {
    "path": "saved_chats/2023-03-28_SizeandImpactHowBigDoesanAsteroidNeedtoBetoCauseSignificantDamage.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: How large would an asteroid hitting the Earth need to be to cause significant damage?\nassistant: The size an asteroid needs to be to cause significant damage depends on several factors, such as its composition, its speed, and its angle of impact. However, it is estimated that a minimum diameter of about 1 km is needed to cause global-scale effects that could lead to catastrophic consequences such as the darkening of the sun, falling temperatures, tsunamis, and the destruction of entire urban areas. \n\nEven smaller asteroids, however, can cause significant damage if they strike densely populated areas. For example, in 2013 an asteroid only about 20 metres in diameter disintegrated in the atmosphere above the Russian city of Chelyabinsk, causing significant property damage and injuring hundreds of people. \n\nSo, although the size of the asteroid is an important factor to consider, other factors such as population density and the location of the impact can also affect the damage caused. It is therefore important to constantly monitor asteroid orbits and to develop methods to prevent and mitigate potential impacts.\n"
  },
  {
    "path": "saved_chats/2023-03-29_InteractiveSystem.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: I want to design a system that can perform certain actions on a computer, for example creating files or searching a folder, while also answering questions the way you are doing now. How do I distinguish between commands and questions without writing a long series of ifs and maybes?\nassistant: One possible solution is to use an interactive voice interface capable of understanding natural language. In this way, the system can be equipped with speech recognition that identifies whether the user is giving a command or asking a question. \n\nFurthermore, once the type of request has been identified, the system can adapt its response accordingly. For example, if the user asks \"how does this command work?\" an explanatory answer is expected, whereas if the user says \"create a new file\" an action is expected. \n\nIn addition, the system can be trained to recognize certain keywords that explicitly identify a command, such as \"create\", \"copy\", \"move\", and so on. \n\nIn any case, maintaining clear and thorough communication with the user is essential to ensure the effectiveness and usability of the system.\nuser: Are there public datasets on which I could train such a system, in particular ones where specific prompts are associated with actions?\nassistant: There are some public sites that offer datasets of conversations between users and chatbots, which can be used to train a conversational system. \n\nFor example, you can look at the conversation datasets available on Kaggle, GitHub, or the Stanford Question Answering Dataset (SQuAD) site. There you can find conversations between users and chatbots in various domains, for example customer service or searching for specific information. \n\nHowever, it should be stressed that the training process requires some experience and careful attention to the quality of the data used. Some machine learning techniques may require large amounts of data to achieve satisfactory results. \n\nIn any case, finding the right datasets and cleaning them are fundamental steps in training a conversational system, especially if you want to associate prompts with specific actions. For these reasons, it may be useful to consult experts in the field for targeted advice.\n"
  },
  {
    "path": "saved_chats/2023-03-29_TraininganInteractiveSystemtoRespondtoCommandsandQuestions.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: I want to design a system that can perform certain actions on a computer, for example creating files or searching a folder, while also answering questions the way you are doing now. How do I distinguish between commands and questions without writing a long series of ifs and maybes?\nassistant: One possible solution is to use an interactive voice interface capable of understanding natural language. In this way, the system can be equipped with speech recognition that identifies whether the user is giving a command or asking a question. \n\nFurthermore, once the type of request has been identified, the system can adapt its response accordingly. For example, if the user asks \"how does this command work?\" an explanatory answer is expected, whereas if the user says \"create a new file\" an action is expected. \n\nIn addition, the system can be trained to recognize certain keywords that explicitly identify a command, such as \"create\", \"copy\", \"move\", and so on. \n\nIn any case, maintaining clear and thorough communication with the user is essential to ensure the effectiveness and usability of the system.\nuser: Are there public datasets on which I could train such a system, in particular ones where specific prompts are associated with actions?\nassistant: There are some public sites that offer datasets of conversations between users and chatbots, which can be used to train a conversational system. \n\nFor example, you can look at the conversation datasets available on Kaggle, GitHub, or the Stanford Question Answering Dataset (SQuAD) site. There you can find conversations between users and chatbots in various domains, for example customer service or searching for specific information. \n\nHowever, it should be stressed that the training process requires some experience and careful attention to the quality of the data used. Some machine learning techniques may require large amounts of data to achieve satisfactory results. \n\nIn any case, finding the right datasets and cleaning them are fundamental steps in training a conversational system, especially if you want to associate prompts with specific actions. For these reasons, it may be useful to consult experts in the field for targeted advice.\n"
  },
  {
    "path": "saved_chats/2023-03-30_ExploringtheColdestTemperaturesonEarthRecordBreakingLowTemperaturesandHistoricalEstimates.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What is the lowest temperature ever recorded on Earth?\nassistant: The lowest temperature ever recorded on Earth was -89.2°C, measured on 21 July 1983 at the Russian research station Vostok, in Antarctica. This was the lowest temperature recorded at the Earth's surface, and it was made possible by a combination of factors such as the altitude of the station, the absence of wind, and the extremely dry climate of the region. These factors created the ideal conditions for such extreme temperatures to develop. It is important to note that this kind of temperature is extremely dangerous for human life, with a risk of frostbite and hypothermia within minutes.\nuser: What is the lowest temperature ever recorded, though, in the history of the Earth, where by recorded I mean estimated?\nassistant: The lowest temperature ever detected on Earth, including values not obtained from official weather stations, was about -93.2°C, observed on 10 August 2010 on the East Antarctic ice sheet, at an altitude of about 3,900 metres. This temperature was obtained from satellite measurements. However, it should be made clear that this extreme value is a satellite-derived estimate of the surface temperature and not a measurement taken at an official weather station.\nuser: What is the lowest estimated temperature on Earth since it was formed?\nassistant: It is not possible to answer your question precisely, because the Earth is about 4.54 billion years old and temperature measurements only began in recent historical times. However, researchers who study the Earth's climate and geology have combined various lines of scientific evidence to estimate the minimum temperatures of past epochs. For example, geological studies indicate that about 700 million years ago the Earth was covered in ice for a long time, a period known as \"Snowball Earth\", and temperatures during this period may have been below -50°C. Other geological events, such as large-scale volcanic eruptions, may have made the Earth's climate warm and humid in the past. In any case, studies and climate simulations suggest that the Earth's minimum temperature has varied significantly over the geological history of our planet.\nassistant: \"Exploring the Coldest Temperatures on Earth: Record-Breaking Low Temperatures and Historical Estimates\"\n"
  },
  {
    "path": "saved_chats/2023-03-30_ExtremeColdTemperatures.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What is the lowest temperature ever recorded on Earth?\nassistant: The lowest temperature ever recorded on Earth was -89.2°C, measured on 21 July 1983 at the Russian research station Vostok, in Antarctica. This was the lowest temperature recorded at the Earth's surface, and it was made possible by a combination of factors such as the altitude of the station, the absence of wind, and the extremely dry climate of the region. These factors created the ideal conditions for such extreme temperatures to develop. It is important to note that this kind of temperature is extremely dangerous for human life, with a risk of frostbite and hypothermia within minutes.\nuser: What is the lowest temperature ever recorded, though, in the history of the Earth, where by recorded I mean estimated?\nassistant: The lowest temperature ever detected on Earth, including values not obtained from official weather stations, was about -93.2°C, observed on 10 August 2010 on the East Antarctic ice sheet, at an altitude of about 3,900 metres. This temperature was obtained from satellite measurements. However, it should be made clear that this extreme value is a satellite-derived estimate of the surface temperature and not a measurement taken at an official weather station.\nuser: What is the lowest estimated temperature on Earth since it was formed?\nassistant: It is not possible to answer your question precisely, because the Earth is about 4.54 billion years old and temperature measurements only began in recent historical times. However, researchers who study the Earth's climate and geology have combined various lines of scientific evidence to estimate the minimum temperatures of past epochs. For example, geological studies indicate that about 700 million years ago the Earth was covered in ice for a long time, a period known as \"Snowball Earth\", and temperatures during this period may have been below -50°C. Other geological events, such as large-scale volcanic eruptions, may have made the Earth's climate warm and humid in the past. In any case, studies and climate simulations suggest that the Earth's minimum temperature has varied significantly over the geological history of our planet.\nassistant: \"Exploring the Coldest Temperatures on Earth: Record-Breaking Low Temperatures and Historical Estimates\"\n"
  },
  {
    "path": "saved_chats/2023-03-30_TagBeerBasics.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What is beer?\nassistant: Beer is an alcoholic beverage made from fermented grains, typically malted barley, hops, yeast, and water. The process involves mashing the grains to extract sugars, boiling the resulting liquid with hops to add bitterness and flavor, and fermenting the mixture with yeast to produce alcohol and carbonation. The specific ingredients and brewing techniques used can vary widely depending on the style of beer being produced, resulting in a wide range of flavors, aromas, and colors. Beer has been enjoyed for thousands of years and is a popular drink around the world.\nassistant: The Basics of Beer: Understanding the Popular Alcoholic Beverage\n"
  },
  {
    "path": "saved_chats/2023-03-30_TagSkyColors.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What color is the sky? Blue or yellow?\nassistant: The color of the sky depends on weather conditions and the time of day. Usually, during daylight hours, the sky appears a beautiful blue, while at sunset and sunrise it can gradually turn yellow-orange. However, these colors can also vary with geographic location and the seasons of the year. For example, in some parts of the world the sky can turn reddish or brown during sandstorms. In general, though, most people associate the color of the sky with blue.\nassistant: \"The Colors of the Sky: Exploring the Hues and Variations\"\n"
  },
  {
    "path": "saved_chats/2023-03-30_TheBasicsofBeerUnderstandingthePopularAlcoholicBeverage.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What is beer?\nassistant: Beer is an alcoholic beverage made from fermented grains, typically malted barley, hops, yeast, and water. The process involves mashing the grains to extract sugars, boiling the resulting liquid with hops to add bitterness and flavor, and fermenting the mixture with yeast to produce alcohol and carbonation. The specific ingredients and brewing techniques used can vary widely depending on the style of beer being produced, resulting in a wide range of flavors, aromas, and colors. Beer has been enjoyed for thousands of years and is a popular drink around the world.\nassistant: The Basics of Beer: Understanding the Popular Alcoholic Beverage\n"
  },
  {
    "path": "saved_chats/2023-03-30_TheColorsoftheSkyExploringtheHuesandVariations.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: What color is the sky? Blue or yellow?\nassistant: The color of the sky depends on weather conditions and the time of day. Usually, during daylight hours, the sky appears a beautiful blue, while at sunset and sunrise it can gradually turn yellow-orange. However, these colors can also vary with geographic location and the seasons of the year. For example, in some parts of the world the sky can turn reddish or brown during sandstorms. In general, though, most people associate the color of the sky with blue.\nassistant: \"The Colors of the Sky: Exploring the Hues and Variations\"\n"
  },
  {
    "path": "saved_chats/2023-04-01_IslandTravel.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: I need to get a gift for someone who would enjoy a trip; what do you suggest?\nassistant: If the person you are buying the gift for is a travel enthusiast, then a trip could be an excellent gift idea. One interesting destination to suggest is Japan, with its ancient culture, splendid nature, and the modernity of big cities like Tokyo. An alternative could be Peru, with the ruins of the ancient Inca sites, the magnificence of the Amazon, and the fascinating city of Cuzco. Both destinations offer numerous activities and places to visit, ensuring a unique and unforgettable travel experience. Remember that a trip is a personal choice, so if you know the person well you can also try to work out their preferences and pick a destination based on their personal interests.\nuser: This person is about to take a trip to Iceland. What can I give them?\nassistant: If the person is about to take a trip to Iceland, a useful gift could be a travel guide or a map of the country, so that they can plan their itinerary and visit the places that interest them most. Another useful gift could be hiking gear, such as a waterproof jacket or trekking boots, which will come in handy during excursions in Iceland's fascinating natural landscapes. Alternatively, you can consider giving a typical Icelandic souvenir, such as a woolly Icelandic sweater or a collection of Icelandic sagas. In this way you can help the person feel more at ease and make the most of their travel experience in Iceland.\nuser: What are some typical experiences in Iceland that require special equipment? Volcanoes come to mind.\nassistant: In Iceland there are many typical experiences that require special equipment, including hikes on glaciers such as Vatnajökull or Langjökull, which require ice boots, an ice axe, and rope. Mountaineering on peaks such as Kirkjufell or Hvannadalshnúkur also requires specific equipment such as crampons, rope, and climbing gear. It is possible to hike on the lava of Eyjafjallajökull or of the Thrihnukagigur volcano; this experience requires a helmet and caving equipment. You could also consider a whale-watching excursion in the bay of Húsavík or fishing in rivers such as the Blanda; these activities require arranging the necessary equipment. Finally, you can go snorkeling or diving in the spectacular Silfra fissure in Thingvellir National Park; these activities require a diving suit and involve immersion in cold water. Always remember to book activities through reliable agencies and to make sure the person has the equipment and knowledge needed to carry out the experience safely.\nassistant: \"Regali e attività per un viaggio in Islanda\"\n"
  },
  {
    "path": "saved_chats/2023-04-01_RegalieattivitàperunviaggioinIslanda.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 20 words.\nuser: I need to get a gift for someone who would enjoy a trip; what do you suggest?\nassistant: If the person you are buying the gift for is a travel enthusiast, then a trip could be an excellent gift idea. One interesting destination to suggest is Japan, with its ancient culture, splendid nature, and the modernity of big cities like Tokyo. An alternative could be Peru, with the ruins of the ancient Inca sites, the magnificence of the Amazon, and the fascinating city of Cuzco. Both destinations offer numerous activities and places to visit, ensuring a unique and unforgettable travel experience. Remember that a trip is a personal choice, so if you know the person well you can also try to work out their preferences and pick a destination based on their personal interests.\nuser: This person is about to take a trip to Iceland. What can I give them?\nassistant: If the person is about to take a trip to Iceland, a useful gift could be a travel guide or a map of the country, so that they can plan their itinerary and visit the places that interest them most. Another useful gift could be hiking gear, such as a waterproof jacket or trekking boots, which will come in handy during excursions in Iceland's fascinating natural landscapes. Alternatively, you can consider giving a typical Icelandic souvenir, such as a woolly Icelandic sweater or a collection of Icelandic sagas. In this way you can help the person feel more at ease and make the most of their travel experience in Iceland.\nuser: What are some typical experiences in Iceland that require special equipment? Volcanoes come to mind.\nassistant: In Iceland there are many typical experiences that require special equipment, including hikes on glaciers such as Vatnajökull or Langjökull, which require ice boots, an ice axe, and rope. Mountaineering on peaks such as Kirkjufell or Hvannadalshnúkur also requires specific equipment such as crampons, rope, and climbing gear. It is possible to hike on the lava of Eyjafjallajökull or of the Thrihnukagigur volcano; this experience requires a helmet and caving equipment. You could also consider a whale-watching excursion in the bay of Húsavík or fishing in rivers such as the Blanda; these activities require arranging the necessary equipment. Finally, you can go snorkeling or diving in the spectacular Silfra fissure in Thingvellir National Park; these activities require a diving suit and involve immersion in cold water. Always remember to book activities through reliable agencies and to make sure the person has the equipment and knowledge needed to carry out the experience safely.\nassistant: \"Regali e attività per un viaggio in Islanda\"\n"
  },
  {
    "path": "saved_chats/2023-04-02_SoundCardComponentsandADC.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 15 words. You can make question.\nuser: Hi, GPT!\nassistant: Hello! How can I help you? Do you need information on a specific topic?\nuser: What are the components of a sound card?\nassistant: Sound cards are made up of several components that work together to produce high-quality sound. Here are some of the most important components of a sound card:\n\n1. Analog-to-digital converter (ADC) - This component converts analog audio signals into digital data that the computer can process.\n\n2. Digital-to-analog converter (DAC) - This component converts digital data into analog audio signals that can be played through speakers.\n\n3. Amplifier - This component boosts the power of the outgoing audio signal, making the sound louder and less distorted.\n\n4. MIDI interface - This component lets you connect digital musical instruments (for example keyboards or drum machines) to the sound card and control them through software.\n\n5. Codec - This component is a unit that handles the encoding and decoding of digital audio signals.\n\n6. Driver - This software component lets the operating system communicate with the sound card and use all of its functions.\nuser: Explain how the ADC, the analog-to-digital converter, is built.\nassistant: The ADC, or analog-to-digital converter, is a fundamental component of sound cards and converts an analog audio signal into a digital signal that the computer can process. \n\nThe ADC consists of three main parts: a sampler, a quantizer, and an encoder.\n\nThe sampler takes the incoming analog signal and samples its voltage at regular time intervals. These sampled values form a sequence of numbers that represents the original sound wave.\n\nThe quantizer approximates the sampled voltages to a given number of bits, creating a discrete set of values. The more bits the quantizer uses, the more faithful the representation of the original sound wave.\n\nFinally, the encoder converts the sequence of sampled and quantized numbers into a digital format that the computer can process. This format can be 8, 16, or 24 bits, depending on the desired quality. Greater converter precision improves the quality of the resulting audio. \n\nIn summary, the ADC turns an analog audio signal into a sequence of digital numbers taken at regular time intervals, using a sampler, a quantizer, and an encoder.\nassistant: \"Understanding the Components of a Sound Card and the Function of ADC\"\n"
  },
  {
    "path": "saved_chats/2023-04-02_UnderstandingtheComponentsofaSoundCardandtheFunctionofADC.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 15 words. You can make question.\nuser: Hi, GPT!\nassistant: Hello! How can I help you? Do you need information on a specific topic?\nuser: What are the components of a sound card?\nassistant: Sound cards are made up of several components that work together to produce high-quality sound. Here are some of the most important components of a sound card:\n\n1. Analog-to-digital converter (ADC) - This component converts analog audio signals into digital data that the computer can process.\n\n2. Digital-to-analog converter (DAC) - This component converts digital data into analog audio signals that can be played through speakers.\n\n3. Amplifier - This component boosts the power of the outgoing audio signal, making the sound louder and less distorted.\n\n4. MIDI interface - This component lets you connect digital musical instruments (for example keyboards or drum machines) to the sound card and control them through software.\n\n5. Codec - This component is a unit that handles the encoding and decoding of digital audio signals.\n\n6. Driver - This software component lets the operating system communicate with the sound card and use all of its functions.\nuser: Explain how the ADC, the analog-to-digital converter, is built.\nassistant: The ADC, or analog-to-digital converter, is a fundamental component of sound cards and converts an analog audio signal into a digital signal that the computer can process. \n\nThe ADC consists of three main parts: a sampler, a quantizer, and an encoder.\n\nThe sampler takes the incoming analog signal and samples its voltage at regular time intervals. These sampled values form a sequence of numbers that represents the original sound wave.\n\nThe quantizer approximates the sampled voltages to a given number of bits, creating a discrete set of values. The more bits the quantizer uses, the more faithful the representation of the original sound wave.\n\nFinally, the encoder converts the sequence of sampled and quantized numbers into a digital format that the computer can process. This format can be 8, 16, or 24 bits, depending on the desired quality. Greater converter precision improves the quality of the resulting audio. \n\nIn summary, the ADC turns an analog audio signal into a sequence of digital numbers taken at regular time intervals, using a sampler, a quantizer, and an encoder.\nassistant: \"Understanding the Components of a Sound Card and the Function of ADC\"\n"
  },
  {
    "path": "saved_chats/2023-04-03_ExploringMultiHeadSelfAttentionforKeywordIdentificationinNaturalLanguageProcessing.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 15 words. You can make question.\nuser: How can I use multi-head self-attention to identify keywords in a sentence?\nassistant: Multi-head self-attention is a powerful technique in natural language processing that can be used to identify keywords in a sentence. The idea behind multi-head self-attention is to allow the model to pay attention to different parts of the input sequence at the same time. \n\nTo identify keywords, you can apply multi-head self-attention to a sentence and then examine the attention weights to see which words have the highest weight. These words are likely to be the most important in the sentence and can be considered keywords.\n\nAnother approach is to use a method called \"querying,\" where the model is given a specific keyword and then asked to find the parts of the sentence that are most relevant to that keyword. This can be done by applying multi-head self-attention and then examining the attention weights for the keyword's position in the input sequence.\n\nOverall, multi-head self-attention is a powerful tool for identifying keywords in a sentence and can be used in a variety of applications, such as text classification and information extraction.\nuser: Does OpenAI provide the tools to compute multi-head self-attention of a given text?\nassistant: Yes, OpenAI provides a toolkit called \"OpenAI API\" that allows you to compute multi-head self-attention for a given text. The OpenAI API provides access to a range of pre-trained language models, such as GPT-2 and GPT-3, which are capable of performing multi-head self-attention on text data.\n\nTo use the OpenAI API, you first need to sign up for an API key and then import the API client into your Python environment. 
Once you have set up the API client, you can use it to send requests to the OpenAI API server, which will process your text data and return a response containing the multi-head self-attention weights.\n\nThe OpenAI API offers a simple and straightforward way to access powerful language models and compute complex natural language processing tasks, such as multi-head self-attention, without the need for extensive computational resources or expertise in machine learning.\nassistant: \"Exploring Multi-Head Self-Attention for Keyword Identification in Natural Language Processing\"\n"
  },
  {
    "path": "saved_chats/2023-04-03_TagNaturalLanguageProcessing.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 15 words. You can make question.\nuser: How can I use multi-head self-attention to identify keywords in a sentence?\nassistant: Multi-head self-attention is a powerful technique in natural language processing that can be used to identify keywords in a sentence. The idea behind multi-head self-attention is to allow the model to pay attention to different parts of the input sequence at the same time. \n\nTo identify keywords, you can apply multi-head self-attention to a sentence and then examine the attention weights to see which words have the highest weight. These words are likely to be the most important in the sentence and can be considered keywords.\n\nAnother approach is to use a method called \"querying,\" where the model is given a specific keyword and then asked to find the parts of the sentence that are most relevant to that keyword. This can be done by applying multi-head self-attention and then examining the attention weights for the keyword's position in the input sequence.\n\nOverall, multi-head self-attention is a powerful tool for identifying keywords in a sentence and can be used in a variety of applications, such as text classification and information extraction.\nuser: Does OpenAI provide the tools to compute multi-head self-attention of a given text?\nassistant: Yes, OpenAI provides a toolkit called \"OpenAI API\" that allows you to compute multi-head self-attention for a given text. The OpenAI API provides access to a range of pre-trained language models, such as GPT-2 and GPT-3, which are capable of performing multi-head self-attention on text data.\n\nTo use the OpenAI API, you first need to sign up for an API key and then import the API client into your Python environment. 
Once you have set up the API client, you can use it to send requests to the OpenAI API server, which will process your text data and return a response containing the multi-head self-attention weights.\n\nThe OpenAI API offers a simple and straightforward way to access powerful language models and compute complex natural language processing tasks, such as multi-head self-attention, without the need for extensive computational resources or expertise in machine learning.\nassistant: \"Exploring Multi-Head Self-Attention for Keyword Identification in Natural Language Processing\"\n"
  },
  {
    "path": "saved_chats/2023-04-04_ExploringFeasibleOptionsforPoweringanIronManSuitSolarCellsBatteriesandNuclearReactors.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 15 words. You can make question.\nuser: How would you power an Iron Man suit?\nassistant: There are several ways to power an Iron Man suit. One option could be to use a miniaturized nuclear reactor, which would provide a lot of energy in a small space. Another option could be to use a super capacitor, which would be able to store and release large amounts of energy quickly. Alternatively, the suit could be powered by a combination of solar cells and batteries, which would be renewable and lightweight. Additionally, the suit could be powered by harnessing kinetic energy from the wearer's movements, similar to how some watches are powered by the motion of the wearer's wrist. The choice of power source would depend on factors such as the size and weight of the suit, the duration of use, the level of energy required for the suit's functions, and the feasibility of the technology.\nuser: Which one is the most feasible? How would you make a nuclear reactor in a small space?\nassistant: In terms of feasibility, a combination of solar cells and batteries may be the most practical option to power an Iron Man suit. This is because solar cells can generate electricity from sunlight, which is a free and renewable source of energy, and batteries can store excess energy generated by the cells for later use. The use of solar cells and batteries would also eliminate the need for any dangerous or radioactive materials, reducing the risk of accidents or environmental damage.\n\nHowever, for the smallest possible space, a miniaturized nuclear reactor may be the best choice for powering an Iron Man suit, as it can generate a very high level of energy in a small space, but there are many technical and safety hurdles that would need to be overcome to make this technology practical. 
\n\nCreating a miniaturized nuclear reactor would require advanced engineering and materials science to develop a compact reactor core that can operate safely and efficiently. One approach to creating a miniaturized reactor is to use advanced materials such as ceramic-based fuels or liquid metal coolants, which can withstand high temperatures and radiation. Another approach is to use fusion rather than fission to generate power, which could be safer and more efficient, but the technology is not yet available for practical applications. Additionally, the control and regulation of a miniaturized nuclear reactor in a small space presents significant technical challenges, requiring advanced automation and control systems to ensure safe and consistent operation.\nassistant: \"Exploring Feasible Options for Powering an Iron Man Suit: Solar Cells, Batteries, and Nuclear Reactors\"\n"
  },
  {
    "path": "saved_chats/2023-04-04_PoweringIronManSuit.txt",
    "content": "system: You are a helpful assistant and you will answer in paragraphs. A paragraph can be as long as 15 words. You can make question.\nuser: How would you power an Iron Man suit?\nassistant: There are several ways to power an Iron Man suit. One option could be to use a miniaturized nuclear reactor, which would provide a lot of energy in a small space. Another option could be to use a super capacitor, which would be able to store and release large amounts of energy quickly. Alternatively, the suit could be powered by a combination of solar cells and batteries, which would be renewable and lightweight. Additionally, the suit could be powered by harnessing kinetic energy from the wearer's movements, similar to how some watches are powered by the motion of the wearer's wrist. The choice of power source would depend on factors such as the size and weight of the suit, the duration of use, the level of energy required for the suit's functions, and the feasibility of the technology.\nuser: Which one is the most feasible? How would you make a nuclear reactor in a small space?\nassistant: In terms of feasibility, a combination of solar cells and batteries may be the most practical option to power an Iron Man suit. This is because solar cells can generate electricity from sunlight, which is a free and renewable source of energy, and batteries can store excess energy generated by the cells for later use. The use of solar cells and batteries would also eliminate the need for any dangerous or radioactive materials, reducing the risk of accidents or environmental damage.\n\nHowever, for the smallest possible space, a miniaturized nuclear reactor may be the best choice for powering an Iron Man suit, as it can generate a very high level of energy in a small space, but there are many technical and safety hurdles that would need to be overcome to make this technology practical. 
\n\nCreating a miniaturized nuclear reactor would require advanced engineering and materials science to develop a compact reactor core that can operate safely and efficiently. One approach to creating a miniaturized reactor is to use advanced materials such as ceramic-based fuels or liquid metal coolants, which can withstand high temperatures and radiation. Another approach is to use fusion rather than fission to generate power, which could be safer and more efficient, but the technology is not yet available for practical applications. Additionally, the control and regulation of a miniaturized nuclear reactor in a small space presents significant technical challenges, requiring advanced automation and control systems to ensure safe and consistent operation.\nassistant: \"Exploring Feasible Options for Powering an Iron Man Suit: Solar Cells, Batteries, and Nuclear Reactors\"\n"
  },
  {
    "path": "saved_chats/2023-04-05_DockerContainerization.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: What is Docker in the IT domain?\nassistant: Docker is a software platform that allows developers to create, deploy, and manage applications by creating containerized environments. Containers are lightweight and standalone executable packages that include everything needed to run an application, including code, libraries, and system tools. Docker provides a standardized way to build and distribute these containers, making it much easier for developers to manage dependencies and deploy applications across different environments.\nuser: Could you make an example of how Docker can be used?\nassistant: Certainly! Here's an example scenario:\n\nLet's say you are a developer working on a web application that needs to be deployed on multiple servers, possibly running different operating systems. Without Docker, you would need to manually install all the dependencies and libraries required for the application to run on each server. This can be time-consuming and error-prone, especially when dealing with complex applications.\n\nWith Docker, you can create a container that includes all the software components and configurations needed to run the application, regardless of the host environment. You can then deploy this container to any server running Docker, and it will run exactly the same way as it does on your local development environment. This makes it easier to manage and deploy your application while also ensuring consistency across different environments.\n\nFurthermore, if you need to make changes to your application, you can simply make those changes to the container rather than individually updating each server. This can save a lot of time and make development more efficient.\nuser: Does Docker have something to do with Raspberry Pi and Linux OS?\nassistant: Yes, Docker can be used with Raspberry Pi and Linux OS. 
Docker provides support for multiple operating systems, including Linux, which runs on Raspberry Pi. This means that you can use Docker on a Raspberry Pi to containerize applications and manage their deployment, just as you would on a Linux server. \n\nIn fact, Docker is becoming increasingly popular for IoT (Internet of Things) applications where Raspberry Pis are being used as edge devices. Docker's lightweight containerization technology allows developers to package their code, dependencies, and configuration files together, making it easier to deploy, manage and update applications on edge devices like Raspberry Pi.\n\nAdditionally, Docker has a growing catalog of pre-built images, some of which are specifically designed to run on ARM architecture (which is what the Raspberry Pi uses). So, Docker can be very useful to Raspberry Pi users who are interested in running a wide variety of applications on their device.\nuser: How hard is a docker to learn? Is it similar to any other programming language? Is it a programming language at all?\nassistant: Docker is not a programming language, it is a tool for containerization which can be learned quite easily. Although it has its own set of commands and terminology, Docker is not similar to any programming language in terms of syntax or structure.\n\nThe basic concepts of Docker are relatively simple: containers, images, and registries. Once you understand these concepts, you can start working with Docker commands and creating your own containers. However, if you want to become an advanced Docker user, you may need to learn more about networking, security, and Dockerfile scripting.\n\nOverall, the difficulty of learning Docker depends on your experience with containerization and system administration, and how deeply you want to dive into Docker's advanced features. For someone with some basic knowledge of containerization or server administration, Docker can be learned quickly and easily. 
Whether you are a developer, DevOps engineer, or system administrator, Docker is a valuable tool that is worth investing time in learning.\nuser: Explain this concept easily, like what is containerization and server administration.\nassistant: Sure! Here's a simplified explanation:\n\nContainerization refers to the process of creating a lightweight, standalone environment that includes all the necessary software components and configuration files needed to run an application. Containers are similar to virtual machines but with some key differences - they share the host operating system kernel but have their own file system, network stack, and resources.\n\nContainerization is useful because it allows you to easily package an application and its dependencies together into a single container, making it easy to deploy, manage, and scale the application. With containerization, you can create consistent environments for software development, testing, and deployment, regardless of the host operating system.\n\nServer administration refers to the process of managing and maintaining the software, hardware, and security of a server. In a typical web application architecture, the server is the computer that hosts the application and serves it to clients over the internet. Server administration includes tasks like installing and updating software, configuring network settings, monitoring system performance, and ensuring security.\n\nDocker is a tool that combines the concepts of containerization and server administration. With Docker, you can create and manage containers, making it easier to deploy and manage server applications. Docker simplifies many of the tasks typically associated with server administration, like dependency management and software installations, and makes it easier to manage multiple servers at once.\nassistant: \"Understanding Docker: Containerization, Raspberry Pi and More\"\n"
  },
  {
    "path": "saved_chats/2023-04-07_ApproachesforTextto.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: What is a text-to-speech?\nassistant: Text-to-speech (TTS) is a technology that converts written text into spoken words. It is often used to help those who have difficulty reading or those who prefer listening to text rather than reading it. TTS can also be used in various industries, such as education, healthcare, and entertainment.\nuser: What are the most popular technologies behind text-to-speech recognition?\nassistant: The three most popular technology types used in text-to-speech recognition are:\n\n1. Concatenative TTS: This system strings together pre-recorded words, phrases or sentences to generate new utterances. \n\n2. Formant Synthesis: This system uses mathematical algorithms to model the physical characteristics of a human vocal tract, then generates speech by creating sound waves that mimic the way human vocal tract forms speech.\n\n3. Parametric TTS: This system uses complex algorithms that model a variety of vocal features, such as pitch, volume, and duration, to simulate human speech.\nuser: How can I do text-to-speech with deep learning techniques?\nassistant: One popular deep learning approach for text-to-speech synthesis is using a generative model called WaveNet, which was developed by Google's AI team in 2016.\n\nWaveNet uses a type of neural network called a convolutional neural network (CNN), which takes raw audio waveform signals as input and generates new audio waveforms as output. 
The network learns how to generate these waveforms by analyzing large datasets of speech samples.\n\nAnother deep learning architecture that can be used for text-to-speech synthesis is the Tacotron, which uses a combination of a sequence-to-sequence model for text processing and a vocoder to generate speech.\n\nBoth WaveNet and Tacotron require large datasets of speech samples for training, as well as significant computational resources to provide sufficient processing power.\nassistant: Exploring Text-to-Speech: Popular Techniques and Deep Learning Approaches\n"
  },
  {
    "path": "saved_chats/2023-04-09_FitbitDataandSleep.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: Are there correlations between minimum heart rate and sleep quality?\nassistant: Yes, there are some correlations between minimum heart rate and sleep quality. For example, a lower minimum heart rate can indicate greater relaxation and better sleep quality. In addition, a drop in heart rate during sleep can indicate good regulation of autonomic nervous system activity, which is important for controlling sleep and wakefulness. However, it is important to also consider other factors, such as age, lifestyle, and any medical conditions, which can affect the relationship between heart rate and sleep quality.\nuser: I am developing an architecture that links Fitbit data and sleep quality. Which Fitbit data could potentially complement sleep-quality sensors?\nassistant: Fitbit is a wearable device that can track various health and wellness information, including sleep quality. Fitbit uses several sensors, such as an accelerometer, a gyroscope, and an optical sensor, to collect data on users' sleep quality. Some of the information Fitbit can collect to assess sleep quality includes:\n\n- The duration of REM (rapid eye movement) sleep during the night.\n- The duration of slow-wave sleep during the night.\n- The duration of deep sleep stages during the night.\n- The time taken to fall asleep.\n- The number of nighttime awakenings and their duration.\n- Total sleep time and its breakdown into cycles.\n\nFitbit uses this data to calculate an overall sleep quality score and provide recommendations for improving it. In addition, Fitbit lets you monitor other factors that can affect sleep quality, such as physical activity level, heart rate, water intake, and users' own behavior during the day.\nassistant: \"Exploring the Connection between Fitbit Data and Sleep Quality\"\n"
  },
  {
    "path": "saved_chats/2023-04-11_Fitness.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: Is it ok if my leg hurts after stretching?\nassistant: It depends on the type and duration of the stretching exercise. Some mild discomfort or soreness is common after stretching, especially if you haven't exercised in a while, but it should not be too intense or last too long. If the pain persists, it might be a sign of an injury, and you should consult a healthcare professional.\nuser: How often should I stretch my legs?\nassistant: The frequency with which you should stretch your legs will depend on several factors, such as your age, physical fitness, and the type of activity you do. In general, stretching your legs every day or after physical activity can help improve flexibility, range of motion, and reduce the risk of injury. It's important to listen to your body and avoid overstretching or going beyond your limits. You may also want to consult with a fitness professional or physical therapist for specific stretching routines based on your fitness goals and individual needs.\nuser: What is the optimal duration of a stretching exercise to gain benefits?\nassistant: The optimal duration of a stretching exercise can vary depending on several factors such as age, fitness level, and the type of stretching being performed. In general, most experts recommend holding each stretch for 15-30 seconds and repeating each stretch 2-4 times. This duration has been shown to be effective in improving flexibility and range of motion, as well as reducing muscle soreness and the risk of injury. However, it's important to remember that everyone's body is different, and you should always listen to your body and avoid overstretching to prevent injury. If you have any concerns or specific goals, you might want to consider consulting with a qualified fitness professional.\nassistant: The Importance of Stretching: Tips and Guidelines for Optimal Benefits.\n"
  },
  {
    "path": "saved_chats/2023-04-19_ChineseLanguages.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: What are the most widely spoken languages in China?\nassistant: In China the most widely spoken language is Mandarin Chinese, although Cantonese, Shanghainese, and Hokkien are also spoken in some regions. There are also minority languages such as Tibetan, Mongolian, Manchu, and Korean, used by some of the ethnic groups living in China.\nuser: What are the differences between these languages?\nassistant: The languages spoken in China differ in pronunciation, accent, grammatical structure, and vocabulary. \n\nMandarin Chinese, for example, is the official language of China and is based on a standardized pronunciation, called pinyin. It has a grammatical structure similar to other Asian languages such as Japanese and Korean, and a vast vocabulary made up of thousands of Chinese characters.\n\nCantonese, by contrast, is spoken mainly in Guangdong, Hong Kong, and Macau, and shows greater variation in pronunciation than Mandarin. It also has a more musical intonation and a different vocabulary from Mandarin.\n\nShanghainese is spoken mainly in the Shanghai region and has a more nasal pronunciation than Mandarin. Its grammar and vocabulary also differ from Mandarin.\n\nHokkien, finally, is spoken mainly in Fujian province and in parts of Southeast Asia. It is known for its wide range of tones and consonant sounds, and has a distinctive vocabulary compared with other languages spoken in China.\nassistant: The main languages spoken in China are Mandarin Chinese, Cantonese, Shanghainese, and Hokkien. Each of these languages has its own distinctive pronunciation, grammatical structure, and vocabulary.\nuser: What is the difference between Kien and Shanghainese?\nassistant: I think you might have meant \"Hokkien\" and not \"Kien\". Hokkien, also known as Minnan, is a group of dialects spoken in southern China, Taiwan, and Southeast Asia. On the other hand, Shanghainese is a dialect spoken in and around the city of Shanghai, in eastern China.\n\nThe main differences between Hokkien and Shanghainese are in the pronunciation, vocabulary, and grammar. Hokkien is known for its wide range of tonal sounds and its distinct consonant sounds. The vocabulary of Hokkien has been influenced by the languages of surrounding areas such as Southern Min and Fuzhou. Meanwhile, Shanghainese is known for its use of nasal sounds and its distinct intonation patterns. The vocabulary of Shanghainese has been influenced by local languages and cultures of the region.\n\nIn summary, Hokkien and Shanghainese are two distinct dialects of Chinese with unique characteristics and influences.\nassistant: Hokkien and Shanghainese are two distinct dialects of Chinese with unique characteristics and influences. They differ in pronunciation, vocabulary, and grammar.\nassistant: \"Languages of China\"\nuser: Save this conversation.\nassistant: \"Chinese Languages: Exploring the Differences\"\nassistant: Conversation saved successfully.\nuser: Thank you.\nassistant: You're welcome! If you have any more questions in the future, feel free to ask.\nassistant: \"Differences Between Chinese Languages\"\n"
  },
  {
    "path": "saved_chats/2023-04-19_SyntheticMeat.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: What is synthetic meat?\nassistant: Synthetic meat, also known as lab-grown meat, is meat that is produced through cellular agriculture, using in vitro cultivation of animal cells rather than raising and slaughtering whole animals. This process involves taking stem cells from an animal and growing them in a lab under controlled conditions to create muscle tissue, which can then be harvested and processed into meat products. The goal is to create a sustainable and ethical alternative to traditional animal agriculture that reduces the environmental impact and animal suffering associated with meat consumption.\nuser: What are the challenges to make these products available to the general public?\nassistant: There are several challenges to making synthetic meat products available to the general public, including:\n\n1. Cost: Currently, producing synthetic meat is more expensive than traditional animal agriculture. This is partly because the technology is still new and not yet fully optimized, but also because the process requires significant resources such as energy, specialized equipment, and expensive nutrient mediums.\n\n2. Consumer acceptance: Many people are hesitant to try synthetic meat, either because they are unsure about its safety, or because they are not comfortable with the idea of \"artificial\" meat. Developing products that appeal to consumers and building public trust will be key to wider adoption.\n\n3. Regulatory approval: Synthetic meat products are subject to regulatory scrutiny and may need to meet certain standards before being approved for sale. Different countries have different regulatory frameworks, and navigating this process can be challenging.\n\n4. Scale-up: To make synthetic meat commercially viable, it will need to be produced at scale. 
This will require significant investment in infrastructure, research and development, and supply chain optimization.\n\nOverall, while there is significant potential for synthetic meat to address environmental and animal welfare concerns associated with traditional meat production, there are still many challenges to overcome before it becomes widely available to consumers.\nuser: Ok, thanks, remember of this\nassistant: Synthetic Meat: Challenges and Opportunities\n"
  },
  {
    "path": "saved_chats/2023-04-20_LabGrownMeat.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: Apri la conversazione piu rilevante sulla carne sintetica\nuser: Open The file\nassistant: Talking About Lab-Grown Meat.\n"
  },
  {
    "path": "saved_chats/2023-04-20_SyntheticMeatCont.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nassistant: Synthetic meat, also known as lab-grown meat or cultured meat, is a type of meat that is produced by culturing animal cells in a lab rather than raising and slaughtering animals. Some potential benefits of synthetic meat include reduced environmental impact, improved animal welfare, and increased food security. However, there are also concerns about the safety and long-term health effects of consuming synthetic meat, as well as the potential economic impact on traditional meat industries. Overall, the topic of synthetic meat is complex and controversial, and opinions on its merits and drawbacks vary widely.\nassistant: Synthetic meat is controversial for a number of reasons. Some people are concerned about the safety and long-term health effects of consuming lab-grown meat, while others worry about the potential economic impact on traditional meat industries. Additionally, there are ethical and cultural concerns surrounding the idea of producing meat in a lab rather than raising and slaughtering animals. Overall, the topic of synthetic meat is complex and multifaceted, and opinions on its merits and drawbacks vary widely.\nassistant: \"Controversy Surrounding Synthetic Meat\"\n"
  },
  {
    "path": "saved_chats/2023-04-20_andTextto.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: what do you think about synthetic meat?\nassistant: As an AI language model, I don't have personal opinions or values but I can provide information. Synthetic meat, also known as lab-grown meat or cultured meat, is a type of meat produced by in vitro cultivation of animal cells, rather than by raising and slaughtering animals. Some people believe that synthetic meat can be a way to reduce the environmental impact of the meat industry and provide a more sustainable source of animal protein. However, others are concerned about the safety, taste, and cultural implications of consuming synthetic meat. What do you think about it?\nassistant: \"Synthetic Meat: Pros and Cons\"\nassistant: Synthetic meat, also known as lab-grown meat or cultured meat, is a type of meat produced by in vitro cultivation of animal cells, rather than by raising and slaughtering animals. Some people believe that synthetic meat can be a way to reduce the environmental impact of the meat industry and provide a more sustainable source of animal protein. However, others are concerned about the safety, taste, and cultural implications of consuming synthetic meat. What do you think about it?\nuser: I'm enthusiastic about it, however I've heard it might be less healthy than normal one\nassistant: Synthetic meat has the potential to be a more sustainable and ethical source of animal protein, however there are still some concerns about its health implications. 
More research is needed to determine the safety and nutritional value of synthetic meat.\nuser: summarize the file 2023-04-19_SyntheticMeat.txt\nassistant: Synthetic meat has the potential to be a more sustainable and ethical source of animal protein, however there are still some challenges to overcome before it becomes widely available to consumers, including cost, consumer acceptance, regulatory approval, and scale-up.\nuser: did we ever talked about Artificial Intelligence? \nassistant: Artificial intelligence (AI) is a branch of computer science that aims to create intelligent machines. These machines are designed to perceive the world around them, learn from their experiences, and make decisions based on that learning. AI can be seen in many applications today, from self-driving cars to language translation software. Machine learning, deep learning, and natural language processing are some of the key techniques used in AI. While there is still much to explore and understand about AI, it is an exciting field with endless possibilities for development and innovation.\nuser: Help me choose the best laptop from APple\nassistant: The ideal size of a unified memory for deep learning and machine learning algorithm depends on the size of your dataset, the complexity of your models, and the number of training or inference operations you perform. In general, a minimum of 16GB of unified memory is recommended for deep learning and machine learning workloads. However, if you can afford it, 24GB or even 32GB of unified memory can potentially provide significant performance improvements, especially when working with large datasets that require lots of memory. Additionally, having dedicated graphics memory (VRAM) can also improve performance, especially for tasks that involve a lot of graphics processing.\nuser: What tasks involve graphic processing?\nassistant: There are various tasks that require graphic processing, such as:\n\n1. Video and photo editing\n2. Gaming\n3. 
3D modeling and rendering\n4. Simulation and modeling\n5. Virtual Reality (VR) and Augmented Reality (AR) applications\n6. Artificial Intelligence and Machine Learning, which often involve complex computations and visualizations of data and models.\n\nHaving a powerful graphics card with dedicated VRAM can greatly improve the performance of these tasks, especially when dealing with large and complex datasets.\nassistant: Having a powerful graphics card with dedicated VRAM can greatly improve the performance of tasks that involve graphic processing, especially when dealing with large and complex datasets.\nuser: save this conversation\nassistant: Synthetic Meat and AI\n"
  },
  {
    "path": "saved_chats/2023-05-03_AIConversations.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: quanti giorni sono trascorsi da il 22 aprile 2022 e oggi\nassistant: Mi dispiace, non ho accesso alla data di oggi in tempo reale, ma se consideriamo il 22 aprile 2022 come la data di riferimento, la risposta dipende dal giorno in cui ci troviamo. Ad esempio, se oggi fosse il 1 dicembre 2022, sarebbero trascorsi 223 giorni. Se vuoi, posso anche aiutarti a calcolare la differenza di giorni tra due date specifiche. Hai un'altra data in mente?\nassistant: Mi dispiace, non ho accesso alla data di oggi in tempo reale, ma se consideriamo il 22 aprile 2022 come la data di riferimento, la risposta dipende dal giorno in cui ci troviamo. Ad esempio, se oggi fosse il 1 dicembre 2022, sarebbero trascorsi 223 giorni. Se vuoi, posso anche aiutarti a calcolare la differenza di giorni tra due date specifiche. Hai un'altra data in mente?\nuser: S, oggi  il 3 maggio 2023.\nassistant: In tal caso, da il 22 aprile 2022 sono trascorsi 376 giorni, compresi il 22 aprile e il 3 maggio. Fortunatamente, ci sono molti modi divertenti per trascorrere il tempo! Ad esempio, hai mai fatto un corso di cucina o imparato una nuova lingua? Si potrebbe anche passare il tempo leggendo un buon libro o guardando un film interessante. Quali sono i tuoi hobby preferiti?\nassistant: In tal caso, da il 22 aprile 2022 sono trascorsi 376 giorni, compresi il 22 aprile e il 3 maggio. Fortunatamente, ci sono molti modi divertenti per trascorrere il tempo! Ad esempio, hai mai fatto un corso di cucina o imparato una nuova lingua? Si potrebbe anche passare il tempo leggendo un buon libro o guardando un film interessante. Quali sono i tuoi hobby preferiti?\nassistant: \"Days Gone By: Entertaining Conversation with an AI Assistant\"\n"
  },
  {
    "path": "saved_chats/2023-05-03_MeaninginLife.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: cosa ne pensi del mito decisivo?\nassistant: Mi dispiace, ma per esserti di aiuto avrei bisogno di saperne di pi. Di che mito decisivo stai parlando? Potrei farti qualche domanda in pi per capire di cosa si tratta e darti la mia opinione in merito.\nassistant: Mi dispiace, ma per esserti di aiuto avrei bisogno di saperne di pi. Di che mito decisivo stai parlando? Potrei farti qualche domanda in pi per capire di cosa si tratta e darti la mia opinione in merito.\nuser: Cosa ne pensi del mito di Sisypho?\nassistant: Il mito di Sisifo  una storia molto affascinante e complessa che pu essere interpretata in diversi modi. Personalmente, trovo che questa storia sia molto profonda e ricca di significati. Sisifo  condannato a spingere una grande pietra su una collina, solo per vederla rotolare gi e dover ripetere il compito all'infinito. Per molti, questo mito rappresenta la lotta umana per trovare un significato nella vita, nonostante la futilit e il destino inevitabile della morte. Altri interpretano il mito di Sisifo come un simbolo della condizione umana, che consiste nella continua ricerca di senso e significato in un mondo talvolta ostile e insensato. Personalmente, credo che il mito di Sisifo sia un monito per non dare mai per scontata la propria vita e lottare sempre per ci che si ritiene importante, nonostante le difficolt.\nassistant: Il mito di Sisifo  una storia molto affascinante e complessa che pu essere interpretata in diversi modi. Personalmente, trovo che questa storia sia molto profonda e ricca di significati. Sisifo  condannato a spingere una grande pietra su una collina, solo per vederla rotolare gi e dover ripetere il compito all'infinito. Per molti, questo mito rappresenta la lotta umana per trovare un significato nella vita, nonostante la futilit e il destino inevitabile della morte. 
Altri interpretano il mito di Sisifo come un simbolo della condizione umana, che consiste nella continua ricerca di senso e significato in un mondo talvolta ostile e insensato. Personalmente, credo che il mito di Sisifo sia un monito per non dare mai per scontata la propria vita e lottare sempre per ci che si ritiene importante, nonostante le difficolt.\nuser: In che modo si pu interpretare la massima, devi immaginare, Sisifo felice.\nassistant: La frase \"devi immaginare Sisifo felice\"  stata coniata dal filosofo francese Albert Camus e pu essere interpretata come un invito a considerare la condizione umana di Sisifo come qualcosa di ben diverso da una condanna. Nonostante la fatica e la frustrazione che pu derivare dal suo impossibile compito, Sisifo rimane comunque un uomo libero, in grado di decidere come affrontare la propria vita. Pur sforzandosi continuamente per raggiungere un obiettivo che sa essere inutile, Sisifo pu comunque trovare gioia e soddisfazione solo nel fatto di compiere il suo compito e di far parte del processo universale dell'esistenza. In pratica, la massima invita a superare la visione tragica dell'esistenza, a guardare alla vita con ottimismo e a trovare un senso nella propria lotta quotidiana. Sisifo felice rappresenta quindi un individuo che accetta la propria condizione, vive pienamente il presente e trova la felicit nella forza della sua volont e nella libert di scegliere se e come impegnarsi.\nassistant: La frase \"devi immaginare Sisifo felice\"  stata coniata dal filosofo francese Albert Camus e pu essere interpretata come un invito a considerare la condizione umana di Sisifo come qualcosa di ben diverso da una condanna. Nonostante la fatica e la frustrazione che pu derivare dal suo impossibile compito, Sisifo rimane comunque un uomo libero, in grado di decidere come affrontare la propria vita. 
Pur sforzandosi continuamente per raggiungere un obiettivo che sa essere inutile, Sisifo pu comunque trovare gioia e soddisfazione solo nel fatto di compiere il suo compito e di far parte del processo universale dell'esistenza. In pratica, la massima invita a superare la visione tragica dell'esistenza, a guardare alla vita con ottimismo e a trovare un senso nella propria lotta quotidiana. Sisifo felice rappresenta quindi un individuo che accetta la propria condizione, vive pienamente il presente e trova la felicit nella forza della sua volont e nella libert di scegliere se e come impegnarsi.\nassistant: \"The Myth of Sisyphus: Finding Meaning in Life\"\n"
  },
  {
    "path": "saved_chats/2023-06-07_ImageProcessing.txt",
    "content": "system: You are a helpful assistant. You can make question to make the conversation entertaining.\nuser: Spiega in cosa consiste la tecnica del Region Growing nell'Image Processing.\nassistant: La tecnica del Region Growing (o \"crescita di regione\")  una tecnica di elaborazione delle immagini utilizzata per segmentare un'immagine in regioni omogenee. Il processo inizia selezionando un pixel o un insieme di pixel iniziali noti come \"semi-seme\". Da qui, la tecnica si espande attraverso l'immagine, aggiungendo pixel alla regione corrente solo se le loro propriet soddisfano determinati criteri (come la similarit dei valori di colore o luminosit). Questo processo viene ripetuto finch il sistema non ha suddiviso l'intera immagine in regioni pi piccole e omogenee. La tecnica del Region Growing  considerata uno dei metodi pi efficienti ed efficaci per la segmentazione delle immagini, ed  ampiamente utilizzata in una vasta gamma di applicazioni, dall'elaborazione delle immagini mediche alla classificazione automatica degli oggetti.\nassistant: La tecnica del Region Growing (o \"crescita di regione\")  una tecnica di elaborazione delle immagini utilizzata per segmentare un'immagine in regioni omogenee. Il processo inizia selezionando un pixel o un insieme di pixel iniziali noti come \"semi-seme\". Da qui, la tecnica si espande attraverso l'immagine, aggiungendo pixel alla regione corrente solo se le loro propriet soddisfano determinati criteri (come la similarit dei valori di colore o luminosit). Questo processo viene ripetuto finch il sistema non ha suddiviso l'intera immagine in regioni pi piccole e omogenee. 
La tecnica del Region Growing  considerata uno dei metodi pi efficienti ed efficaci per la segmentazione delle immagini, ed  ampiamente utilizzata in una vasta gamma di applicazioni, dall'elaborazione delle immagini mediche alla classificazione automatica degli oggetti.\nuser: Entra pi nel dettaglio nei criteri di espansione e di terminazione del processo.\nassistant: Certamente! Nei criteri di espansione della tecnica del Region Growing, vengono considerati uno o pi pixel iniziali (semi-seme) che si trovano in una regione omogenea. La tecnica utilizza quindi un insieme di regole per controllare l'espansione di quella regione in modo che si possano includere i pixel adiacenti che soddisfano i requisiti di similarit determinati. Generalmente, questi requisiti considerano la distanza e la similitudine di un dato pixel dai pixel presenti nella regione iniziale. \n\nPer quanto riguarda i criteri di terminazione, questi sono solitamente stabiliti in modo che l'espansione della regione possa continuare in modo efficiente e con precisione, fino a quando non si raggiunge una regione omogenea delle dimensioni desiderate. Ci significa che il processo di espansione deve essere terminato quando non ci sono pi pixel che soddisfano i requisiti di similarit oppure quando si raggiunge una dimensione prefissata della regione. Tuttavia, poich il processo di espansione pu rimanere bloccato in determinati punti dell'immagine, possono essere impostati anche dei criteri di terminazione basati su un numero massimo di iterazioni o su un tempo massimo di esecuzione. \n\nIn generale, la combinazione dei criteri di espansione e di terminazione determina la precisione e l'efficienza dell'algoritmo di Region Growing per la segmentazione delle immagini.\nassistant: Certamente! Nei criteri di espansione della tecnica del Region Growing, vengono considerati uno o pi pixel iniziali (semi-seme) che si trovano in una regione omogenea. 
La tecnica utilizza quindi un insieme di regole per controllare l'espansione di quella regione in modo che si possano includere i pixel adiacenti che soddisfano i requisiti di similarit determinati. Generalmente, questi requisiti considerano la distanza e la similitudine di un dato pixel dai pixel presenti nella regione iniziale. \n\nPer quanto riguarda i criteri di terminazione, questi sono solitamente stabiliti in modo che l'espansione della regione possa continuare in modo efficiente e con precisione, fino a quando non si raggiunge una regione omogenea delle dimensioni desiderate. Ci significa che il processo di espansione deve essere terminato quando non ci sono pi pixel che soddisfano i requisiti di similarit oppure quando si raggiunge una dimensione prefissata della regione. Tuttavia, poich il processo di espansione pu rimanere bloccato in determinati punti dell'immagine, possono essere impostati anche dei criteri di terminazione basati su un numero massimo di iterazioni o su un tempo massimo di esecuzione. \n\nIn generale, la combinazione dei criteri di espansione e di terminazione determina la precisione e l'efficienza dell'algoritmo di Region Growing per la segmentazione delle immagini.\nuser: How seeds are chosen for region growing in image processing?\nassistant: There are a few different approaches for choosing the seeds for the Region Growing technique in Image Processing. One common method is to use user-defined seeds, where the user manually selects a few pixels in the image to represent the initial seeds for the Region Growing algorithm. This can be useful in cases where the image features are well-defined and easy to recognize.\n\nAnother approach is to use automated seed selection, where the algorithm automatically selects the initial seeds based on certain criteria, such as identifying a local maximum or minimum point in the image. 
This can be useful when the image is very large or complex, and it is difficult for a user to manually select the seed pixels.\n\nFinally, hybrid approaches can also be used, where the algorithm selects initial seed pixels based on automated criteria and then allows the user to refine the selection further if necessary.\n\nOverall, the choice of seed selection method will depend on the specific application and the characteristics of the image being processed.\nassistant: There are a few different approaches for choosing the seeds for the Region Growing technique in Image Processing. One common method is to use user-defined seeds, where the user manually selects a few pixels in the image to represent the initial seeds for the Region Growing algorithm. This can be useful in cases where the image features are well-defined and easy to recognize.\n\nAnother approach is to use automated seed selection, where the algorithm automatically selects the initial seeds based on certain criteria, such as identifying a local maximum or minimum point in the image. This can be useful when the image is very large or complex, and it is difficult for a user to manually select the seed pixels.\n\nFinally, hybrid approaches can also be used, where the algorithm selects initial seed pixels based on automated criteria and then allows the user to refine the selection further if necessary.\n\nOverall, the choice of seed selection method will depend on the specific application and the characteristics of the image being processed.\nuser: Can I choose the seed randomly again and again and then consider the average of the resulting images?\nassistant: Yes, you could choose the seed randomly and run the Region Growing algorithm multiple times with different seed selections, then take the average of the resulting segmented regions. 
This approach is sometimes used to improve the overall segmentation accuracy, as it reduces the impact of any individual seed selection errors or inconsistencies in the image data.\n\nHowever, it's worth noting that this approach can be computationally intensive, particularly if you are processing large images or running the algorithm many times. Additionally, depending on the specific application and image data, using a random seed selection may not always provide the best segmentation results, so it's important to carefully evaluate the performance of the algorithm and choose an appropriate seed selection method accordingly.\nassistant: Yes, you could choose the seed randomly and run the Region Growing algorithm multiple times with different seed selections, then take the average of the resulting segmented regions. This approach is sometimes used to improve the overall segmentation accuracy, as it reduces the impact of any individual seed selection errors or inconsistencies in the image data.\n\nHowever, it's worth noting that this approach can be computationally intensive, particularly if you are processing large images or running the algorithm many times. Additionally, depending on the specific application and image data, using a random seed selection may not always provide the best segmentation results, so it's important to carefully evaluate the performance of the algorithm and choose an appropriate seed selection method accordingly.\nuser: ok now how does region growing matter with pixel connectivity\nassistant: Pixel connectivity plays a crucial role in the region growing algorithm. In region growing, the algorithm expands by adding neighboring pixels that meet certain similarity criteria to the initial seed(s). The concept of \"neighborhood\" is defined by the pixel connectivity criterion used by the algorithm.\n\nIn general, there are two commonly used connectivity criteria in the context of image processing: 4-connectivity and 8-connectivity. 
In 4-connectivity, each pixel is connected only to its cardinal neighbors (up, down, left, right), while in 8-connectivity, each pixel is connected to its diagonal neighbors as well. \n\nThe choice of connectivity criteria affects how region growing proceeds. For example, if a 4-connected criterion is used, the region grows only along the cardinal directions, resulting in a square or rectangular region. If an 8-connected criterion is used, the region grows in a diagonal direction as well, meaning that the resulting region has a more circular or elliptical shape.\n\nIn summary, the choice of connectivity criterion directly influences the shape of the segmented regions produced by the region growing algorithm and therefore has a significant impact on the quality of the segmentation. It is important to choose the appropriate criterion for a specific application based on the image characteristics and the desired results.\nassistant: Pixel connectivity plays a crucial role in the region growing algorithm. In region growing, the algorithm expands by adding neighboring pixels that meet certain similarity criteria to the initial seed(s). The concept of \"neighborhood\" is defined by the pixel connectivity criterion used by the algorithm.\n\nIn general, there are two commonly used connectivity criteria in the context of image processing: 4-connectivity and 8-connectivity. In 4-connectivity, each pixel is connected only to its cardinal neighbors (up, down, left, right), while in 8-connectivity, each pixel is connected to its diagonal neighbors as well. \n\nThe choice of connectivity criteria affects how region growing proceeds. For example, if a 4-connected criterion is used, the region grows only along the cardinal directions, resulting in a square or rectangular region. 
If an 8-connected criterion is used, the region grows in a diagonal direction as well, meaning that the resulting region has a more circular or elliptical shape.\n\nIn summary, the choice of connectivity criterion directly influences the shape of the segmented regions produced by the region growing algorithm and therefore has a significant impact on the quality of the segmentation. It is important to choose the appropriate criterion for a specific application based on the image characteristics and the desired results.\nassistant: \"Region Growing and Pixel Connectivity in Image Processing\"\n"
  },
  {
    "path": "saved_chats/DATAFRAME.csv",
    "content": ",file_names,similarity,tags\n0,DATAFRAME.csv,0,\n"
  },
  {
    "path": "setup.bat",
    "content": "@echo off\r\n\r\necho Checking for Python installation...\r\nwhere python >nul 2>nul || (\r\n  echo Python not found. Please install Python 3.9 and add it to the system PATH.\r\n  exit /b 1\r\n)\r\n\r\necho Checking for CUDA installation...\r\nwhere nvcc >nul 2>nul || (\r\n  echo CUDA not found. Please install CUDA and add it to the system PATH.\r\n  exit /b 1\r\n)\r\n\r\necho Detecting CUDA version...\r\nfor /f \"delims=\" %%i in ('powershell -command \"(Get-Content -Path $env:CUDA_PATH\\version.txt) -replace '^CUDA Version (.+)$', '$1' -replace '^$', 'unknown'\"') do set CUDA_VERSION=%%i\r\necho Found CUDA version: %CUDA_VERSION%\r\n\r\necho Creating a new virtual environment...\r\ncall py -3.9 -m venv jarvis_venv\r\ncall .\\jarvis_venv\\Scripts\\activate\r\ncall python --version\r\n\r\npause\r\necho Upgrading pip...\r\ncall python -m pip install --upgrade pip\r\n\r\necho Installing required packages...\r\ncall pip install -r venv_requirements.txt\r\n\r\necho Overwriting the OpenAI-Whisper packages\r\nmove .\\whisper_edits\\__init__.py .\\jarvis_venv\\Lib\\site-packages\\whisper\r\nmove .\\whisper_edits\\model.py .\\jarvis_venv\\Lib\\site-packages\\whisper\r\nrmdir whisper_edits\r\n\r\necho Downloading TTS repository as a ZIP file...\r\ncall pip install tts==0.13.0\r\ncall curl -L -o tts.zip https://github.com/coqui-ai/TTS/archive/refs/heads/dev.zip\r\n\r\necho Extracting TTS repository...\r\npowershell -command \"Expand-Archive -Path .\\tts.zip -DestinationPath .\\temp_tts_folder -Force\"\r\n\r\ncd .\\temp_tts_folder\\TTS-dev\\\r\npip install -e .[all,dev,notebooks]\r\ncd..\r\ncd..\r\n\r\necho Moving test_TTS.py into the TTS repository...\r\ncopy test_TTS.py temp_tts_folder\\TTS-dev\r\n\r\necho Executing test_TTS.py...\r\ncall python temp_tts_folder\\TTS-dev\\test_TTS.py\r\n\r\necho Cleaning up...\r\nrmdir /s /q temp_tts_folder\r\ndel tts.zip\r\n\r\necho Installing PyTorch according to your CUDA version...\r\nset pytorch_skipped = 0\r\nif 
\"%CUDA_VERSION%\"==\"unknown\" (\r\n    echo Unable to determine CUDA version. Recommended version for this project is 11.8 https://developer.nvidia.com/cuda-11-8-0-download-archive.\r\n    echo Once installed, run this command in powershell or command prompt: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118\r\n    pause\r\n    set pytorch_skipped = 1\r\n) else (\r\n    setlocal enabledelayedexpansion\r\n    set \"CUDA_MAJOR_VERSION=!CUDA_VERSION:.=!\"\r\n    if !CUDA_MAJOR_VERSION! GEQ 12 (\r\n        echo CUDA version is 12 or above, pytorch is not available. Please downgrade for operation.\r\n        echo Here are isntructions: https://github.com/bycloudai/SwapCudaVersionWindows\r\n        echo ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n        echo The recommended version for this project is 11.7 https://developer.nvidia.com/cuda-11-7-0-download-archive.\r\n        echo Once installed, run this command in powershell or command prompt: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117\r\n        pause\r\n        set pytorch_skipped = 1\r\n    ) \r\n    endlocal\r\n\r\n    if \"%CUDA_VERSION%\"==\"11.8\" (\r\n        echo installing pytorch for CUDA 11.8\r\n        call pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118\r\n        set pytorch_skipped = 0\r\n    )\r\n    if \"%CUDA_VERSION%\"==\"11.7\" (\r\n        echo installing pytorch for CUDA 11.7\r\n        call pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117\r\n        set pytorch_skipped = 0\r\n    )\r\n    if pytorch_skipped == 1 (\r\n        echo Not recommended CUDA version, some realeases have support, trying direct download.\r\n        for /f \"tokens=* delims=.\" %%a in (\"%CUDA_VERSION%\") do (\r\n            set \"CUDA_VERSION_NO_DOT=%%a%%b\"\r\n        )\r\n        call pip install torch -f 
https://download.pytorch.org/whl/cu%CUDA_VERSION_NO_DOT%/torch_stable.html\r\n        \r\n        echo ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n        echo If it completed successfully, your installation is complete. Otherwise, a matching PyTorch build could not be found.\r\n        echo Please install pytorch according to your CUDA version: \"%CUDA_VERSION%\"\r\n    )\r\n)\r\n\r\necho Preparing to install Vicuna GPT from TroubleChute (https://hub.tcno.co/ai/text-ai/vicuna/)\r\ncd Vicuna\r\npowershell .\\vicuna.ps1\r\n\r\necho ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\necho Setup complete!\r\necho ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\necho Please set up the required API keys in the .env.txt file (remove the .txt extension when finished), then run the test.bat file or tests.py to test the setup\r\npause"
  },
  {
    "path": "test_TTS.py",
    "content": "from TTS.api import TTS\n\ntry:\n    # Running a multi-speaker and multi-lingual model\n\n    # List available 🐸TTS models and choose the first one\n    model_name = TTS.list_models()[0]\n    # Init TTS\n    tts = TTS(model_name)\n    # Run TTS\n    # ❗ Since this model is multi-speaker and multi-lingual, we must set the target speaker and the language\n    # Text to speech with a numpy output\n    wav = tts.tts(\"This is a test! This is also a test!!\", speaker=tts.speakers[0], language=tts.languages[0])\n    # Text to speech to a file\n    tts.tts_to_file(text=\"Hello world!\", speaker=tts.speakers[0], language=tts.languages[0], file_path=\"output.wav\")\n\n    # Running a single speaker model\n\n    # Init TTS with the target model name\n    tts = TTS(model_name=\"tts_models/de/thorsten/tacotron2-DDC\", progress_bar=False, gpu=False)\n    # Run TTS\n\n    # Example voice cloning with YourTTS in English, French and Portuguese:\n    tts = TTS(model_name=\"tts_models/multilingual/multi-dataset/your_tts\", progress_bar=False, gpu=True)\n    tts.tts_to_file(\"This is voice cloning.\", speaker_wav=\".\\Assistant\\\\voices\\\\voices.wav\", language=\"en\", file_path=\"output.wav\")\n    tts.tts_to_file(\"C'est le clonage de la voix.\", speaker_wav=\".\\Assistant\\\\voices\\\\voices.wav\", language=\"fr-fr\", file_path=\"output.wav\")\n    tts.tts_to_file(\"Isso é clonagem de voz.\", speaker_wav=\".\\Assistant\\\\voices\\\\voices.wav\", language=\"pt-br\", file_path=\"output.wav\")\n\n    # Example voice cloning by a single speaker TTS model combining with the voice conversion model. 
This way, you can\n    # clone voices by using any model in 🐸TTS.\n\n    tts = TTS(\"tts_models/de/thorsten/tacotron2-DDC\")\n    tts.tts_with_vc_to_file(\n        \"Wie sage ich auf Italienisch, dass ich dich liebe?\",\n        speaker_wav=\".\Assistant\\\\voices\\\\voices.wav\",\n        file_path=\"output.wav\"\n    )\n\n    # Example text to speech using [🐸Coqui Studio](https://coqui.ai) models. You can use all of your available speakers in the studio.\n    # [🐸Coqui Studio](https://coqui.ai) API token is required. You can get it from the [account page](https://coqui.ai/account).\n    # You should set the `COQUI_STUDIO_TOKEN` environment variable to use the API token.\n\n    # If you have a valid API token set you will see the studio speakers as separate models in the list.\n    # The name format is coqui_studio/en/<studio_speaker_name>/coqui_studio\n    models = TTS().list_models()\n    # Init TTS with the target studio speaker\n    tts = TTS(model_name=\"coqui_studio/en/Torcull Diarmuid/coqui_studio\", progress_bar=False, gpu=False)\nexcept Exception as e:\n    print(e)\n    pass"
  },
  {
    "path": "tests.py",
    "content": "\"\"\"\nRUN THIS TO CHECK FOR ERRORS AFTER INSTALLATION\nIt exercises the main components and may raise errors. If a traceback occurs, something might be wrong with your environment\nor API credentials\n\"\"\"\n\nimport os\nimport whisper\nimport pygame\nimport Assistant.get_audio as myaudio\nimport Assistant.tools\nimport sys\nimport time\nimport openai\nfrom contextlib import contextmanager \nfrom dotenv import load_dotenv\n\nfrom Assistant import get_audio as myaudio\nfrom Assistant.VirtualAssistant import VirtualAssistant\nfrom Assistant.tools import count_tokens\n\ndef load_keys():\n    load_dotenv()\n    # os.getenv with a default avoids a KeyError when a variable is missing entirely\n    if len(os.getenv('OPENAI_API_KEY', ''))==0: \n        print('openai API key not detected in .env')\n        raise Exception(\"[$] openai API key is required. Learn more at https://platform.openai.com/account/api-keys\")\n    openai.api_key = os.environ['OPENAI_API_KEY']\n    if len(os.getenv('IBM_API_KEY', ''))==0: print('[free] IBM cloud API Key not detected in .env\\nLearn more at: https://cloud.ibm.com/catalog/services/text-to-speech')\n    if len(os.getenv('IBM_TTS_SERVICE', ''))==0: print('[free] IBM cloud TTS service not detected in .env\\nLearn more at: https://cloud.ibm.com/catalog/services/text-to-speech')\n    if len(os.getenv('PORCUPINE_KEY', '')) == 0: print('[free] PicoVoice not detected in .env\\nLearn more at: https://picovoice.ai/platform/porcupine/')\n    \n\ndef check_whisper():\n    whisper_model = whisper.load_model(\"small\")\n    response, _ = myaudio.whisper_wav_to_text(os.path.join('Assistant', 'voices', 'jarvis_en.wav'), whisper_model)\n    assert isinstance(response, str)\n\ndef check_openai():\n    chat = [{\"role\": \"system\", \"content\": \"You are a helpful assistant\"},{\"role\":\"user\", \"content\":\"say hi\"}]\n    now = time.perf_counter()\n    openai.ChatCompletion.create(\n        model=\"gpt-3.5-turbo\",\n        temperature=0,\n        max_tokens=10,\n        messages=chat)\n    \n    then = time.perf_counter()\n    
print(f'time to get a simple answer from gpt-3.5-turbo: {then-now} seconds')\n\n    now = time.perf_counter()\n    openai.Embedding.create(input='yes', engine='text-embedding-ada-002')\n    then = time.perf_counter()\n    print(f'time to extract an embedding with text-embedding-ada-002: {then-now} seconds')\n\n    \n\ndef check_virtual_assistant():\n    whisper_model = whisper.load_model(\"small\")  \n    VA = VirtualAssistant(openai_api = os.getenv('OPENAI_API_KEY'),\n        ibm_api    = os.getenv('IBM_API_KEY'),\n        ibm_url    = os.getenv('IBM_TTS_SERVICE'),\n        voice_id   = 'jarvis_en',\n        whisper_model= whisper_model,\n        awake_with_keywords=[\"jarvis\"],\n        model= \"gpt-3.5-turbo\",\n        embed_model= \"text-embedding-ada-002\",\n        RESPONSE_TIME = 3,\n        SLEEP_DELAY = 30,\n    )\n    \n    now = time.perf_counter()\n    VA.translator.translate('ciao questo testo di moderate dimensioni da tradurre in lingua inglese', from_language='it', to_language='en')\n    then = time.perf_counter()\n    print(f'time to translate a short text: {then-now} seconds')\n\n    VA.say('Welcome.\\n I am Just A Very Intelligent System here to help', VoiceIdx='jarvis')\n    VA.say('Welcome.\\n I am Just A Very Intelligent System here to help', VoiceIdx='en')\n\n\n@contextmanager\ndef suppress_stdout():\n    with open(os.devnull, \"w\") as devnull:\n        old_stdout = sys.stdout\n        sys.stdout = devnull\n        try:  \n            yield\n        finally:\n            sys.stdout = old_stdout\n\nif __name__=='__main__':\n\n    load_keys()\n    check_openai()\n    with suppress_stdout():\n        check_whisper()\n\n    check_virtual_assistant()\n"
  },
  {
    "path": "venv_requirements.txt",
    "content": "absl-py==1.1.0\r\naiohttp==3.8.3\r\naiosignal==1.3.1\r\napache-beam==2.40.0\r\nappdirs==1.4.4\r\nargon2-cffi==21.3.0\r\nargon2-cffi-bindings==21.2.0\r\nargostranslate\r\nastor==0.8.1\r\nastunparse==1.6.3\r\nasync-timeout==4.0.2\r\nasynctest==0.13.0\r\nattrs==22.1.0\r\naudioread==2.1.9\r\nbackcall==0.2.0\r\nbeautifulsoup4==4.11.1\r\nbleach==5.0.1\r\ncachetools==5.2.0\r\ncertifi==2022.6.15\r\ncffi==1.15.1\r\ncharset-normalizer==2.1.0\r\nclick==8.1.3\r\ncloudpickle==2.1.0\r\ncolorama==0.4.5\r\ncommonmark==0.9.1\r\ncomtypes==1.1.14\r\ncontextlib2==21.6.0\r\ncrcmod==1.7\r\ncycler==0.11.0\r\nCython==0.29.30\r\ndebugpy==1.6.2\r\ndecorator==5.1.1\r\ndefusedxml==0.7.1\r\ndill==0.3.1.1\r\ndm-tree==0.1.7\r\ndocopt==0.6.2\r\neinops==0.6.0\r\nelevenlabslib\r\nentrypoints==0.4\r\net-xmlfile==1.1.0\r\netils==0.6.0\r\nfastavro==1.5.2\r\nfastjsonschema==2.16.1\r\nffmpeg-python==0.2.0\r\nfilelock==3.9.0\r\nflatbuffers==1.12\r\nfonttools==4.33.3\r\nfrozenlist==1.3.3\r\nfuture==0.18.2\r\ngast==0.4.0\r\ngin-config==0.5.0\r\ngoogle-api-core==2.8.2\r\ngoogle-api-python-client==2.52.0\r\ngoogle-auth==2.9.0\r\ngoogle-auth-httplib2==0.1.0\r\ngoogle-auth-oauthlib==0.4.6\r\ngoogle-pasta==0.2.0\r\ngoogleapis-common-protos==1.56.3\r\ngreenlet==2.0.1\r\ngrpcio==1.47.0\r\nh5py==3.7.0\r\nhdfs==2.7.0\r\nhttplib2==0.20.4\r\nhuggingface-hub==0.11.1\r\nibm-cloud-sdk-core==3.16.1\r\nibm-watson==6.1.0\r\nidna==3.3\r\nimbalanced-learn==0.10.0\r\nimblearn==0.0\r\nimportlib-metadata==4.12.0\r\nimportlib-resources==5.8.0\r\ninflect==5.3.0\r\nipykernel==6.16.2\r\nipython==7.34.0\r\nipython-genutils==0.2.0\r\nipywidgets==7.7.1\r\njedi==0.18.1\r\nJinja2==3.1.2\r\njoblib==1.2.0\r\njsonpatch==1.32\r\njsonpointer==2.3\r\njsonschema==4.9.1\r\njupyter==1.0.0\r\njupyter-client==7.3.4\r\njupyter-console==6.4.4\r\njupyter-core==4.11.1\r\njupyterlab-pygments==0.2.2\r\njupyterlab-widgets==1.1.1\r\nkaggle==1.5.12\r\nkeras==2.9.0\r\nKeras-Applications==1.0.8\r\nKeras-Preprocessing==1.1.2\r\nkiwisolver==1.4.
3\r\nlangid==1.1.6\r\nlibclang==14.0.1\r\nlibrosa==0.8.1\r\nllvmlite==0.39.0\r\nlvis==0.5.3\r\nlxml==4.9.0\r\nMarkdown==3.3.7\r\nMarkupSafe==2.1.1\r\nmatplotlib-inline==0.1.3\r\nmistune==0.8.4\r\nmore-itertools==9.0.0\r\nmultidict==6.0.4\r\nmultiprocess==0.70.9\r\nnbclient==0.6.6\r\nnbconvert==6.5.0\r\nnbformat==5.4.0\r\nnest-asyncio==1.5.5\r\nnetworkx==2.6.3\r\nnotebook==6.4.12\r\nnumba==0.56.0\r\nnumpy\r\noauth2client==4.1.3\r\noauthlib==3.2.0\r\nopenai==0.27.4\r\nopencv-python==4.5.5.64\r\nopencv-python-headless==4.7.0.68\r\nopenpyxl==3.0.10\r\nopt-einsum==3.3.0\r\norjson==3.7.5\r\npackaging==21.3\r\npandas==1.1.5\r\npandocfilters==1.5.0\r\nparso==0.8.3\r\npickleshare==0.7.5\r\nPillow==8.4.0\r\npkgutil_resolve_name==1.3.10\r\nplaywright==1.29.1\r\nplotly==5.14.0\r\npooch==1.6.0\r\nportalocker==2.4.0\r\nprogressbar==2.5\r\nprometheus-client==0.14.1\r\npromise==2.3\r\nprompt-toolkit==3.0.30\r\nproto-plus==1.20.6\r\nprotobuf==3.19.4\r\npsutil==5.9.1\r\npy-cpuinfo==8.0.0\r\npyarrow==7.0.0\r\npyasn1==0.4.8\r\npyasn1-modules==0.2.8\r\nPyAudio==0.2.13\r\npycocotools==2.0.4\r\npycparser==2.21\r\npydantic==1.10.4\r\npydot==1.4.2\r\npydub==0.25.1\r\npyee==9.0.4\r\nPygments==2.12.0\r\nPyJWT==2.6.0\r\npymongo==3.12.3\r\npynndescent==0.5.7\r\npyparsing==2.4.7\r\npypiwin32==223\r\nPyQt5==5.15.6\r\nPyQt5-Qt5==5.15.2\r\nPyQt5-sip==12.11.0\r\npyreadline3==3.4.1\r\npyrsistent==0.18.1\r\npython-dateutil==2.8.2\r\npython-slugify==6.1.2\r\npyttsx3==2.90\r\npytz==2022.1\r\npywin32==304\r\npywinpty==2.0.7\r\nPyYAML==5.4.1\r\npyzmq==23.2.0\r\nqtconsole==5.3.1\r\nQtPy==2.1.0\r\nregex==2022.6.2\r\nrequests==2.28.1\r\nrequests-oauthlib==1.3.1\r\nresampy==0.3.1\r\nrich==13.0.1\r\nrotary-embedding-torch==0.2.1\r\nrsa==4.8\r\nsacrebleu==2.1.0\r\nsemantic-version==2.10.0\r\nSend2Trash==1.8.0\r\nsentencepiece==0.1.96\r\nseqeval==1.2.2\r\nsetuptools-rust==1.5.2\r\nsimpleaudio==1.0.4\r\nsix==1.16.0\r\nsounddevice==0.4.3\r\nSoundFile==0.10.3.post1\r\nsoupsieve==2.3.2.post1\r\ntabulate==0.8.10\r\nt
enacity==8.2.2\r\ntensorboard==2.9.1\r\ntensorboard-data-server==0.6.1\r\ntensorboard-plugin-wit==1.8.1\r\ntensorflow==2.9.1\r\ntensorflow-addons==0.17.1\r\ntensorflow-datasets==4.6.0\r\ntensorflow-estimator==2.9.0\r\ntensorflow-gpu==2.9.1\r\ntensorflow-hub==0.12.0\r\ntensorflow-io==0.26.0\r\ntensorflow-io-gcs-filesystem==0.26.0\r\ntensorflow-metadata==1.9.0\r\ntensorflow-model-optimization==0.7.2\r\ntensorflow-text==2.9.0\r\ntermcolor==1.1.0\r\nterminado==0.15.0\r\ntext-unidecode==1.3\r\ntf-models-official==2.9.2\r\ntf-slim==1.1.0\r\nthreadpoolctl==3.1.0\r\ntinycss2==1.1.1\r\ntokenizers==0.13.2\r\ntoml==0.10.2\r\ntornado==6.2\r\ntqdm==4.62.3\r\ntraitlets==5.3.0\r\ntransformers==4.25.1\r\ntypeguard==2.13.3\r\ntyping_extensions==4.2.0\r\numap-learn==0.5.2\r\nUnidecode==1.3.2\r\nuritemplate==4.1.1\r\nurllib3==1.26.7\r\nvisdom==0.1.8.9\r\nwcwidth==0.2.5\r\nwebencodings==0.5.1\r\nwebrtcvad==2.0.10\r\nwebsocket-client==1.1.0\r\nWerkzeug==2.1.2\r\nwhisper==1.0.0\r\nwidgetsnbextension==3.6.1\r\nwrapt==1.14.1\r\nyarl==1.8.2\r\nzipp==3.8.0\r\npvporcupine==2.1.4\r\nwhisper-openai==1.0.0\r\ntextblob==0.17.1\r\nlangid==1.1.6\r\nSpeechRecognition==3.10.0\r\npandas\r\npyaudio==0.2.13\r\npygame==2.3.0\r\npyttsx3 \r\ntranslators\r\nopenai\r\nibm-cloud-sdk-core\r\nibm-watson\r\nlibrosa\r\nfsspec\r\ncoqpit\r\ntrainer\r\npysbd\r\nmatplotlib>=3.7.1\r\nscipy >= 1.10.1\r\nscikit-learn >= 1.2.2\r\nplotly \r\npython-dotenv\r\nlangchain\r\nnewsapi-python\r\nwikipedia\r\ngeocoder"
  },
  {
    "path": "whisper_edits/__init__.py",
    "content": "import hashlib\nimport io\nimport os\nimport urllib\nimport warnings\nfrom typing import List, Optional, Union\n\nimport torch\nfrom tqdm import tqdm\n\nfrom .audio import load_audio, log_mel_spectrogram, pad_or_trim\nfrom .decoding import DecodingOptions, DecodingResult, decode, detect_language\nfrom .model import Whisper, ModelDimensions\nfrom .transcribe import transcribe\n\n\n\n_MODELS = {\n    \"tiny.en\": \"https://openaipublic.azureedge.net/main/whisper/models/d3dd57d32accea0b295c96e26691aa14d8822fac7d9d27d5dc00b4ca2826dd03/tiny.en.pt\",\n    \"tiny\": \"https://openaipublic.azureedge.net/main/whisper/models/65147644a518d12f04e32d6f3b26facc3f8dd46e5390956a9424a650c0ce22b9/tiny.pt\",\n    \"base.en\": \"https://openaipublic.azureedge.net/main/whisper/models/25a8566e1d0c1e2231d1c762132cd20e0f96a85d16145c3a00adf5d1ac670ead/base.en.pt\",\n    \"base\": \"https://openaipublic.azureedge.net/main/whisper/models/ed3a0b6b1c0edf879ad9b11b1af5a0e6ab5db9205f891f668f8b0e6c6326e34e/base.pt\",\n    \"small.en\": \"https://openaipublic.azureedge.net/main/whisper/models/f953ad0fd29cacd07d5a9eda5624af0f6bcf2258be67c92b79389873d91e0872/small.en.pt\",\n    \"small\": \"https://openaipublic.azureedge.net/main/whisper/models/9ecf779972d90ba49c06d968637d720dd632c55bbf19d441fb42bf17a411e794/small.pt\",\n    \"medium.en\": \"https://openaipublic.azureedge.net/main/whisper/models/d7440d1dc186f76616474e0ff0b3b6b879abc9d1a4926b7adfa41db2d497ab4f/medium.en.pt\",\n    \"medium\": \"https://openaipublic.azureedge.net/main/whisper/models/345ae4da62f9b3d59415adc60127b97c714f32e89e936602e85993674d08dcb1/medium.pt\",\n    \"large-v1\": \"https://openaipublic.azureedge.net/main/whisper/models/e4b87e7e0bf463eb8e6956e646f1e277e901512310def2c24bf0e11bd3c28e9a/large-v1.pt\",\n    \"large-v2\": \"https://openaipublic.azureedge.net/main/whisper/models/81f7c96c852ee8fc832187b0132e569d6c3065a3252ed18e56effd0b6a73e524/large-v2.pt\",\n    \"large\": 
\"https://openaipublic.azureedge.net/main/whisper/models/81f7c96c852ee8fc832187b0132e569d6c3065a3252ed18e56effd0b6a73e524/large-v2.pt\",\n}\n\n\ndef _download(url: str, root: str, in_memory: bool) -> Union[bytes, str]:\n    os.makedirs(root, exist_ok=True)\n\n    expected_sha256 = url.split(\"/\")[-2]\n    download_target = os.path.join(root, os.path.basename(url))\n\n    if os.path.exists(download_target) and not os.path.isfile(download_target):\n        raise RuntimeError(f\"{download_target} exists and is not a regular file\")\n\n    if os.path.isfile(download_target):\n        with open(download_target, \"rb\") as f:\n            model_bytes = f.read()\n        if hashlib.sha256(model_bytes).hexdigest() == expected_sha256:\n            return model_bytes if in_memory else download_target\n        else:\n            warnings.warn(f\"{download_target} exists, but the SHA256 checksum does not match; re-downloading the file\")\n\n    with urllib.request.urlopen(url) as source, open(download_target, \"wb\") as output:\n        with tqdm(total=int(source.info().get(\"Content-Length\")), ncols=80, unit='iB', unit_scale=True, unit_divisor=1024) as loop:\n            while True:\n                buffer = source.read(8192)\n                if not buffer:\n                    break\n\n                output.write(buffer)\n                loop.update(len(buffer))\n\n    model_bytes = open(download_target, \"rb\").read()\n    if hashlib.sha256(model_bytes).hexdigest() != expected_sha256:\n        raise RuntimeError(\"Model has been downloaded but the SHA256 checksum does not not match. 
Please retry loading the model.\")\n\n    return model_bytes if in_memory else download_target\n\n\ndef available_models() -> List[str]:\n    \"\"\"Returns the names of available models\"\"\"\n    return list(_MODELS.keys())\n\n\ndef load_model(name: str, device: Optional[Union[str, torch.device]] = None, download_root: str = None, in_memory: bool = False) -> Whisper:\n    \"\"\"\n    Load a Whisper ASR model\n\n    Parameters\n    ----------\n    name : str\n        one of the official model names listed by `whisper.available_models()`, or\n        path to a model checkpoint containing the model dimensions and the model state_dict.\n    device : Union[str, torch.device]\n        the PyTorch device to put the model into\n    download_root: str\n        path to download the model files; by default, it uses \"~/.cache/whisper\"\n    in_memory: bool\n        whether to preload the model weights into host memory\n\n    Returns\n    -------\n    model : Whisper\n        The Whisper ASR model instance\n    \"\"\"\n\n    if device is None:\n        device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n    if download_root is None:\n        download_root = os.getenv(\n            \"XDG_CACHE_HOME\", \n            os.path.join(os.path.expanduser(\"~\"), \".cache\", \"whisper\")\n        )\n\n    if name in _MODELS:\n        checkpoint_file = _download(_MODELS[name], download_root, in_memory)\n    elif os.path.isfile(name):\n        checkpoint_file = open(name, \"rb\").read() if in_memory else name\n    else:\n        raise RuntimeError(f\"Model {name} not found; available models = {available_models()}\")\n\n    with (io.BytesIO(checkpoint_file) if in_memory else open(checkpoint_file, \"rb\")) as fp:\n        checkpoint = torch.load(fp, map_location=device)\n    del checkpoint_file\n\n    dims = ModelDimensions(**checkpoint[\"dims\"])\n    model = Whisper(dims, name=name)\n    model.load_state_dict(checkpoint[\"model_state_dict\"])\n\n    return model.to(device)\n"
  },
  {
    "path": "whisper_edits/model.py",
    "content": "from dataclasses import dataclass\nfrom typing import Dict\nfrom typing import Iterable, Optional\n\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\nfrom torch import Tensor\nfrom torch import nn\n\nfrom .transcribe import transcribe as transcribe_function\nfrom .decoding import detect_language as detect_language_function, decode as decode_function\n\n\n@dataclass\nclass ModelDimensions:\n    n_mels: int\n    n_audio_ctx: int\n    n_audio_state: int\n    n_audio_head: int\n    n_audio_layer: int\n    n_vocab: int\n    n_text_ctx: int\n    n_text_state: int\n    n_text_head: int\n    n_text_layer: int\n\n\nclass LayerNorm(nn.LayerNorm):\n    def forward(self, x: Tensor) -> Tensor:\n        return super().forward(x.float()).type(x.dtype)\n\n\nclass Linear(nn.Linear):\n    def forward(self, x: Tensor) -> Tensor:\n        return F.linear(\n            x, self.weight.to(x.dtype), None if self.bias is None else self.bias.to(x.dtype)\n        )\n\n\nclass Conv1d(nn.Conv1d):\n    def _conv_forward(self, x: Tensor, weight: Tensor, bias: Optional[Tensor]) -> Tensor:\n        return super()._conv_forward(\n            x, weight.to(x.dtype), None if bias is None else bias.to(x.dtype)\n        )\n\n\ndef sinusoids(length, channels, max_timescale=10000):\n    \"\"\"Returns sinusoids for positional embedding\"\"\"\n    assert channels % 2 == 0\n    log_timescale_increment = np.log(max_timescale) / (channels // 2 - 1)\n    inv_timescales = torch.exp(-log_timescale_increment * torch.arange(channels // 2))\n    scaled_time = torch.arange(length)[:, np.newaxis] * inv_timescales[np.newaxis, :]\n    return torch.cat([torch.sin(scaled_time), torch.cos(scaled_time)], dim=1)\n\n\nclass MultiHeadAttention(nn.Module):\n    def __init__(self, n_state: int, n_head: int):\n        super().__init__()\n        self.n_head = n_head\n        self.query = Linear(n_state, n_state)\n        self.key = Linear(n_state, n_state, bias=False)\n        self.value = 
Linear(n_state, n_state)\n        self.out = Linear(n_state, n_state)\n\n    def forward(\n        self,\n        x: Tensor,\n        xa: Optional[Tensor] = None,\n        mask: Optional[Tensor] = None,\n        kv_cache: Optional[dict] = None,\n    ):\n        q = self.query(x)\n\n        if kv_cache is None or xa is None or self.key not in kv_cache:\n            # hooks, if installed (i.e. kv_cache is not None), will prepend the cached kv tensors;\n            # otherwise, perform key/value projections for self- or cross-attention as usual.\n            k = self.key(x if xa is None else xa)\n            v = self.value(x if xa is None else xa)\n        else:\n            # for cross-attention, calculate keys and values once and reuse in subsequent calls.\n            k = kv_cache[self.key]\n            v = kv_cache[self.value]\n\n        wv = self.qkv_attention(q, k, v, mask)\n        return self.out(wv)\n\n    def qkv_attention(self, q: Tensor, k: Tensor, v: Tensor, mask: Optional[Tensor] = None):\n        n_batch, n_ctx, n_state = q.shape\n        scale = (n_state // self.n_head) ** -0.25\n        q = q.view(*q.shape[:2], self.n_head, -1).permute(0, 2, 1, 3) * scale\n        k = k.view(*k.shape[:2], self.n_head, -1).permute(0, 2, 3, 1) * scale\n        v = v.view(*v.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n\n        qk = q @ k\n        if mask is not None:\n            qk = qk + mask[:n_ctx, :n_ctx]\n\n        w = F.softmax(qk.float(), dim=-1).to(q.dtype)\n        return (w @ v).permute(0, 2, 1, 3).flatten(start_dim=2)\n\n\nclass ResidualAttentionBlock(nn.Module):\n    def __init__(self, n_state: int, n_head: int, cross_attention: bool = False):\n        super().__init__()\n\n        self.attn = MultiHeadAttention(n_state, n_head)\n        self.attn_ln = LayerNorm(n_state)\n\n        self.cross_attn = MultiHeadAttention(n_state, n_head) if cross_attention else None\n        self.cross_attn_ln = LayerNorm(n_state) if cross_attention else None\n\n        
n_mlp = n_state * 4\n        self.mlp = nn.Sequential(Linear(n_state, n_mlp), nn.GELU(), Linear(n_mlp, n_state))\n        self.mlp_ln = LayerNorm(n_state)\n\n    def forward(\n        self,\n        x: Tensor,\n        xa: Optional[Tensor] = None,\n        mask: Optional[Tensor] = None,\n        kv_cache: Optional[dict] = None,\n    ):\n        x = x + self.attn(self.attn_ln(x), mask=mask, kv_cache=kv_cache)\n        if self.cross_attn:\n            x = x + self.cross_attn(self.cross_attn_ln(x), xa, kv_cache=kv_cache)\n        x = x + self.mlp(self.mlp_ln(x))\n        return x\n\n\nclass AudioEncoder(nn.Module):\n    def __init__(self, n_mels: int, n_ctx: int, n_state: int, n_head: int, n_layer: int):\n        super().__init__()\n        self.conv1 = Conv1d(n_mels, n_state, kernel_size=3, padding=1)\n        self.conv2 = Conv1d(n_state, n_state, kernel_size=3, stride=2, padding=1)\n        self.register_buffer(\"positional_embedding\", sinusoids(n_ctx, n_state))\n\n        self.blocks: Iterable[ResidualAttentionBlock] = nn.ModuleList(\n            [ResidualAttentionBlock(n_state, n_head) for _ in range(n_layer)]\n        )\n        self.ln_post = LayerNorm(n_state)\n\n    def forward(self, x: Tensor):\n        \"\"\"\n        x : torch.Tensor, shape = (batch_size, n_mels, n_ctx)\n            the mel spectrogram of the audio\n        \"\"\"\n        x = F.gelu(self.conv1(x))\n        x = F.gelu(self.conv2(x))\n        x = x.permute(0, 2, 1)\n\n        assert x.shape[1:] == self.positional_embedding.shape, \"incorrect audio shape\"\n        x = (x + self.positional_embedding).to(x.dtype)\n\n        for block in self.blocks:\n            x = block(x)\n\n        x = self.ln_post(x)\n        return x\n\n\nclass TextDecoder(nn.Module):\n    def __init__(self, n_vocab: int, n_ctx: int, n_state: int, n_head: int, n_layer: int):\n        super().__init__()\n\n        self.token_embedding = nn.Embedding(n_vocab, n_state)\n        self.positional_embedding = 
nn.Parameter(torch.empty(n_ctx, n_state))\n\n        self.blocks: Iterable[ResidualAttentionBlock] = nn.ModuleList(\n            [ResidualAttentionBlock(n_state, n_head, cross_attention=True) for _ in range(n_layer)]\n        )\n        self.ln = LayerNorm(n_state)\n\n        mask = torch.empty(n_ctx, n_ctx).fill_(-np.inf).triu_(1)\n        self.register_buffer(\"mask\", mask, persistent=False)\n\n    def forward(self, x: Tensor, xa: Tensor, kv_cache: Optional[dict] = None):\n        \"\"\"\n        x : torch.LongTensor, shape = (batch_size, <= n_ctx)\n            the text tokens\n        xa : torch.Tensor, shape = (batch_size, n_mels, n_audio_ctx)\n            the encoded audio features to be attended on\n        \"\"\"\n        offset = next(iter(kv_cache.values())).shape[1] if kv_cache else 0\n        x = self.token_embedding(x) + self.positional_embedding[offset : offset + x.shape[-1]]\n        x = x.to(xa.dtype)\n\n        for block in self.blocks:\n            x = block(x, xa, mask=self.mask, kv_cache=kv_cache)\n\n        x = self.ln(x)\n        logits = (x @ torch.transpose(self.token_embedding.weight.to(x.dtype), 0, 1)).float()\n\n        return logits\n\n\nclass Whisper(nn.Module):\n    def __init__(self, dims: ModelDimensions, name=''):\n        super().__init__()\n        self.dims = dims\n        self.encoder = AudioEncoder(\n            self.dims.n_mels,\n            self.dims.n_audio_ctx,\n            self.dims.n_audio_state,\n            self.dims.n_audio_head,\n            self.dims.n_audio_layer,\n        )\n        self.decoder = TextDecoder(\n            self.dims.n_vocab,\n            self.dims.n_text_ctx,\n            self.dims.n_text_state,\n            self.dims.n_text_head,\n            self.dims.n_text_layer,\n        )\n        self.name = name\n\n    def embed_audio(self, mel: torch.Tensor):\n        return self.encoder(mel)\n\n    def logits(self, tokens: torch.Tensor, audio_features: torch.Tensor):\n        return self.decoder(tokens, 
audio_features)\n\n    def forward(self, mel: torch.Tensor, tokens: torch.Tensor) -> Dict[str, torch.Tensor]:\n        return self.decoder(tokens, self.encoder(mel))\n\n    @property\n    def device(self):\n        return next(self.parameters()).device\n\n    @property\n    def is_multilingual(self):\n        return self.dims.n_vocab == 51865\n\n    def install_kv_cache_hooks(self, cache: Optional[dict] = None):\n        \"\"\"\n        The `MultiHeadAttention` module optionally accepts `kv_cache` which stores the key and value\n        tensors calculated for the previous positions. This method returns a dictionary that stores\n        all caches, and the necessary hooks for the key and value projection modules that save the\n        intermediate tensors to be reused during later calculations.\n\n        Returns\n        -------\n        cache : Dict[nn.Module, torch.Tensor]\n            A dictionary object mapping the key/value projection modules to its cache\n        hooks : List[RemovableHandle]\n            List of PyTorch RemovableHandle objects to stop the hooks to be called\n        \"\"\"\n        cache = {**cache} if cache is not None else {}\n        hooks = []\n\n        def save_to_cache(module, _, output):\n            if module not in cache or output.shape[1] > self.decoder.positional_embedding.shape[0]:\n                cache[module] = output  # save as-is, for the first token or cross attention\n            else:\n                cache[module] = torch.cat([cache[module], output], dim=1).detach()\n            return cache[module]\n\n        def install_hooks(layer: nn.Module):\n            if isinstance(layer, MultiHeadAttention):\n                hooks.append(layer.key.register_forward_hook(save_to_cache))\n                hooks.append(layer.value.register_forward_hook(save_to_cache))\n\n        self.decoder.apply(install_hooks)\n        return cache, hooks\n\n    detect_language = detect_language_function\n    transcribe = transcribe_function\n    
decode = decode_function\n"
  },
  {
    "path": "workspaces/Vision_09df18b156814c80a3e1c1ab544423fc/refy_suggestions/test.csv",
    "content": ",id,doi,title,authors,category,source,url,year,score\n0,http://arxiv.org/abs/2305.19280v1,,\"Large language models improve Alzheimer's disease diagnosis using\n  multi-modality data\",\"['Yingjie Feng', 'Jun Wang', 'Xianfeng Gu', 'Xiaoyin Xu', 'Min Zhang']\",,arxiv,http://arxiv.org/abs/2305.19280v1,2021,0.13802443355290694\n1,http://arxiv.org/abs/2305.16222v1,,Incomplete Multimodal Learning for Complex Brain Disorders Prediction,\"['Reza Shirkavand', 'Liang Zhan', 'Heng Huang', 'Li Shen', 'Paul M. Thompson']\",,arxiv,http://arxiv.org/abs/2305.16222v1,2021,0.13695037693570716\n2,http://arxiv.org/abs/2305.10502v1,,\"EENED: End-to-End Neural Epilepsy Detection based on Convolutional\n  Transformer\",\"['Chenyu Liu', 'Xinliang Zhou', 'Yang Liu']\",,arxiv,http://arxiv.org/abs/2305.10502v1,2021,0.12467641282565606\n3,http://arxiv.org/abs/2305.17235v2,,\"COMCAT: Towards Efficient Compression and Customization of\n  Attention-Based Vision Models\",\"['Jinqi Xiao', 'Miao Yin', 'Yu Gong', 'Xiao Zang', 'Jian Ren', 'Bo Yuan']\",,arxiv,http://arxiv.org/abs/2305.17235v2,2021,0.12355122484233379\n4,http://arxiv.org/abs/2306.03022v1,,\"Interpretable Alzheimer's Disease Classification Via a Contrastive\n  Diffusion Autoencoder\",\"['Ayodeji Ijishakin', 'Ahmed Abdulaal', 'Adamos Hadjivasiliou', 'Sophie Martin', 'James Cole']\",,arxiv,http://arxiv.org/abs/2306.03022v1,2021,0.1234453840648734\n5,http://arxiv.org/abs/2305.10060v1,,XAI for Self-supervised Clustering of Wireless Spectrum Activity,\"['Ljupcho Milosheski', 'Gregor Cerar', 'Blaž Bertalanič', 'Carolina Fortuna', 'Mihael Mohorčič']\",,arxiv,http://arxiv.org/abs/2305.10060v1,2021,0.10784067817323988\n6,http://arxiv.org/abs/2306.05514v1,,\"Robust Brain Age Estimation via Regression Models and MRI-derived\n  Features\",\"['Mansoor Ahmed', 'Usama Sardar', 'Sarwan Ali', 'Shafiq Alam', 'Murray Patterson', 'Imdad Ullah 
Khan']\",,arxiv,http://arxiv.org/abs/2306.05514v1,2021,0.1006547027490279\n7,http://arxiv.org/abs/2305.11560v1,,Brain Captioning: Decoding human brain activity into images and text,\"['Matteo Ferrante', 'Furkan Ozcelik', 'Tommaso Boccato', 'Rufin VanRullen', 'Nicola Toschi']\",,arxiv,http://arxiv.org/abs/2305.11560v1,2021,0.09841263998032185\n8,10.1101/2023.05.27.542592,10.1101/2023.05.27.542592,An Explainable and Robust Deep Learning Approach for Automated Electroencephalography-based Schizophrenia Diagnosis,\"Sattiraju, A.; Ellis, C. A.; Miller, R. L.; Calhoun, V. D.\",neuroscience,biorxiv,,2023,0.09216368910512815\n9,http://arxiv.org/abs/2306.00294v1,,\"Affinity-based Attention in Self-supervised Transformers Predicts\n  Dynamics of Object Grouping in Humans\",\"['Hossein Adeli', 'Seoyoung Ahn', 'Nikolaus Kriegeskorte', 'Gregory Zelinsky']\",,arxiv,http://arxiv.org/abs/2306.00294v1,2021,0.0920963415857158\n10,http://arxiv.org/abs/2305.16657v1,,\"Higher Order Gauge Equivariant CNNs on Riemannian Manifolds and\n  Applications\",\"['Gianfranco Cortes', 'Yue Yu', 'Robin Chen', 'Melissa Armstrong', 'David Vaillancourt', 'Baba C. Vemuri']\",,arxiv,http://arxiv.org/abs/2305.16657v1,2021,0.09058977336216062\n11,http://arxiv.org/abs/2306.00989v1,,Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles,\"['Chaitanya Ryali', 'Yuan-Ting Hu', 'Daniel Bolya', 'Chen Wei', 'Haoqi Fan', 'Po-Yao Huang', 'Vaibhav Aggarwal', 'Arkabandhu Chowdhury', 'Omid Poursaeed', 'Judy Hoffman', 'Jitendra Malik', 'Yanghao Li', 'Christoph Feichtenhofer']\",,arxiv,http://arxiv.org/abs/2306.00989v1,2021,0.09058950611280858\n12,10.1101/2023.05.30.542072,10.1101/2023.05.30.542072,The Bergen Breakfast Scanning Club dataset: a deep brain imaging dataset,\"Wang, M.-Y.; Korbmacher, M.; Eikeland, R.; Specht, K.\",neuroscience,biorxiv,,2023,0.08871953085977524\n13,http://arxiv.org/abs/2306.05745v1,,Two Independent Teachers are Better Role Model,\"['Afifa Khaled', 'Ahmed A. 
Mubarak', 'Kun He']\",,arxiv,http://arxiv.org/abs/2306.05745v1,2021,0.08750304025011657\n14,http://arxiv.org/abs/2305.16554v1,,Emergent Agentic Transformer from Chain of Hindsight Experience,\"['Hao Liu', 'Pieter Abbeel']\",,arxiv,http://arxiv.org/abs/2305.16554v1,2021,0.08687888341024813\n15,10.1101/2023.06.08.544050,10.1101/2023.06.08.544050,3D analysis of dissection photographs with surface scanning and machine learning for quantitative neuropathology,\"Gazula, H.; Tregidgo, H.; Billot, B.; Balbastre, Y.; Williams-Ramirez, J.; Herisse, R.; Casamitjana, A.; Melief, E. J.; Latimer, C. S.; Kilgore, M. D.; Montine, M.; Robinson, E.; Blackburn, E.; Marshall, M. S.; Connors, T. R.; Oakley, D. H.; Frosh, M. P.; Van Leemput, K.; Dalca, A. V.; Fischl, B.; Mac Donald, C. L.; Keene, C. D.; Hyman, B.; Iglesias, J. E.\",neuroscience,biorxiv,,2023,0.08621946712332713\n16,http://arxiv.org/abs/2306.03834v1,,\"MTS2Graph: Interpretable Multivariate Time Series Classification with\n  Temporal Evolving Graphs\",\"['Raneen Younis', 'Abdul Hakmeh', 'Zahra Ahmadi']\",,arxiv,http://arxiv.org/abs/2306.03834v1,2021,0.08515588285487911\n17,http://arxiv.org/abs/2306.04655v1,,\"Modulation Classification Through Deep Learning Using Resolution\n  Transformed Spectrograms\",\"['Muhammad Waqas', 'Muhammad Ashraf', 'Muhammad Zakwan']\",,arxiv,http://arxiv.org/abs/2306.04655v1,2021,0.0847741042619827\n18,http://arxiv.org/abs/2306.03284v1,,\"Optimizing Sampling Patterns for Compressed Sensing MRI with Diffusion\n  Generative Models\",\"['Sriram Ravula', 'Brett Levac', 'Ajil Jalal', 'Jonathan I. Tamir', 'Alexandros G. Dimakis']\",,arxiv,http://arxiv.org/abs/2306.03284v1,2021,0.08397880868757927\n19,http://arxiv.org/abs/2306.05029v1,,\"Multi-level Multiple Instance Learning with Transformer for Whole Slide\n  Image Classification\",\"['Ruijie Zhang', 'Qiaozhe Zhang', 'Yingzhuang Liu', 'Hao Xin', 'Yan Liu', 'Xinggang Wang']\",,arxiv,http://arxiv.org/abs/2306.05029v1,2021,0.08395868822917545\n"
  },
  {
    "path": "workspaces/Vision_09df18b156814c80a3e1c1ab544423fc/refy_suggestions/test.html",
    "content": "<!DOCTYPE html>\n<head>\n<meta charset=\"UTF-8\">\n<style>\n.r1 {color: #ffa726; text-decoration-color: #ffa726}\n.r2 {font-weight: bold}\n.r3 {color: #66bb6a; text-decoration-color: #66bb6a; font-weight: bold}\n.r4 {color: #ff7043; text-decoration-color: #ff7043; font-weight: bold}\n.r5 {color: #ff7043; text-decoration-color: #ff7043; font-weight: bold; text-decoration: underline}\n.r6 {color: #ec407a; text-decoration-color: #ec407a}\n.r7 {color: #ec407a; text-decoration-color: #ec407a; font-weight: bold}\n.r8 {color: #81d4fa; text-decoration-color: #81d4fa; text-decoration: underline}\n.r9 {color: #fb8c00; text-decoration-color: #fb8c00}\n.r10 {color: #0f0f0f; text-decoration-color: #0f0f0f}\n.r11 {color: #8e7423; text-decoration-color: #8e7423}\n.r12 {color: #9ccc65; text-decoration-color: #9ccc65}\n.r13 {color: #5d7541; text-decoration-color: #5d7541}\n.r14 {color: #ffa726; text-decoration-color: #ffa726; font-weight: bold}\n.r15 {color: #81d4fa; text-decoration-color: #81d4fa; font-weight: bold}\n.r16 {color: #858687; text-decoration-color: #858687}\n.r17 {color: #6f6f6f; text-decoration-color: #6f6f6f}\nbody {\n    color: #000000;\n    background-color: #1e1e1e;\n}\n</style>\n</head>\n<html>\n<body>\n    <pre style=\"font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n        <code><span class=\"r1\">╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮</span>\n<span class=\"r1\">│</span>  <span class=\"r2\">                                                                                                                                                          </span>  <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r1\">📆  Daily suggestions for: </span><span class=\"r3\">2023-06-12</span>                                                                                                           
           <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>                                                                                                                                                              <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>                                                                                                                                                              <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r4\">🔍  </span><span class=\"r5\">keywords</span>                                                                                                                                               <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>                                                                                                                                                              <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>                                                                                                                                                              <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>    <span class=\"r6\">    </span><span class=\"r7\">1.</span><span class=\"r6\"> </span><span class=\"r8\">model</span>  <span class=\"r6\">    </span><span class=\"r7\">2.</span><span class=\"r6\"> </span><span class=\"r8\">vit</span>  <span class=\"r6\">    </span><span class=\"r7\">3.</span><span class=\"r6\"> </span><span class=\"r8\">models</span>  <span class=\"r6\">    </span><span class=\"r7\">4.</span><span class=\"r6\"> </span><span class=\"r8\">images</span>  <span class=\"r6\">    </span><span class=\"r7\">5.</span><span class=\"r6\"> </span><span class=\"r8\">cnn</span>  <span class=\"r6\">    </span><span class=\"r7\">6.</span><span class=\"r6\"> </span><span class=\"r8\">transformer</span>  <span class=\"r6\">    </span><span class=\"r7\">7.</span><span class=\"r6\"> </span><span 
class=\"r8\">learning</span>  <span class=\"r6\">    </span><span class=\"r7\">8.</span><span class=\"r6\"> </span><span class=\"r8\">image</span>                                     <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>                                                                                                                                                              <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r9\">────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────</span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>                                                                                                                                                              <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r7\"> #   year                  author  title                                                                             DOI                        source  </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 1  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">    Yingjie Feng</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Large language</span><span class=\"r15\"> models </span><span class=\"r14\">improve Alzheimer&#x27;s disease diagnosis using                </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.19280v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   multi-modality data                                  
                          </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 2  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\"> Reza Shirkavand</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Incomplete Multimodal Learning for Complex Brain Disorders Prediction            </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.16222v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 3  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">      Chenyu Liu</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> EENED: End-to-End Neural Epilepsy Detection based on Convolutional               </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.10502v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   
<span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Transformer                                                                    </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 4  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">      Jinqi Xiao</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> COMCAT: Towards Efficient Compression and Customization of                       </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.17235v2\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Attention-Based Vision Models                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span 
class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 5  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">   Ayodeji Ijishakin</span><span class=\"r13\"> et</span><span class=\"r12\"> </span><span class=\"r14\"> Interpretable Alzheimer&#x27;s Disease Classification Via a Contrastive               </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.03022v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                    </span><span class=\"r13\">al.</span><span class=\"r12\"> </span><span class=\"r14\">   Diffusion Autoencoder                                                          </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 6  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">  Ljupcho Milosheski</span><span class=\"r13\"> et</span><span class=\"r12\"> </span><span class=\"r14\"> XAI for Self-supervised Clustering of Wireless Spectrum Activity                 </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.10060v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span 
class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                    </span><span class=\"r13\">al.</span><span class=\"r12\"> </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 7  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">   Mansoor Ahmed</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Robust Brain Age Estimation via Regression Models and MRI-derived                </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.05514v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Features                                                                       </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                  
  </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 8  </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\"> Matteo Ferrante</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Brain Captioning: Decoding human brain activity into</span><span class=\"r15\"> images </span><span class=\"r14\">and text             </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.11560v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 9  </span><span class=\"r10\"> </span><span class=\"r11\">2023</span><span class=\"r10\"> </span><span class=\"r12\">   Sattiraju, A.</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> An Explainable and Robust Deep Learning Approach for Automated                   </span><span class=\"r10\"> </span><a class=\"r16\" href=\"https://doi.org/10.1101/2023.05.27.542592\">10.1101/2023.05.27.542592</a><span class=\"r10\">  </span><span class=\"r17\">biorxiv</span><span class=\"r10\"> </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\"> Electroencephalography-based Schizophrenia Diagnosis                             </span><span class=\"r10\">                                    </span>   <span 
class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 10 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">   Hossein Adeli</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Affinity-based Attention in Self-supervised Transformers Predicts                </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.00294v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Dynamics of Object Grouping in Humans                                          </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 11 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">   Gianfranco Cortes</span><span class=\"r13\"> et</span><span class=\"r12\"> </span><span class=\"r14\"> Higher Order Gauge Equivariant CNNs on Riemannian 
Manifolds and                  </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.16657v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                    </span><span class=\"r13\">al.</span><span class=\"r12\"> </span><span class=\"r14\">   Applications                                                                   </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 12 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\"> Chaitanya Ryali</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles          </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.00989v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span 
class=\"r1\">│</span>   <span class=\"r6\"> 13 </span><span class=\"r10\"> </span><span class=\"r11\">2023</span><span class=\"r10\"> </span><span class=\"r12\">     Wang, M.-Y.</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> The Bergen Breakfast Scanning Club dataset: a deep brain imaging dataset         </span><span class=\"r10\"> </span><a class=\"r16\" href=\"https://doi.org/10.1101/2023.05.30.542072\">10.1101/2023.05.30.542072</a><span class=\"r10\">  </span><span class=\"r17\">biorxiv</span><span class=\"r10\"> </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 14 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">    Afifa Khaled</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Two Independent Teachers are Better Role Model                                   </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.05745v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 15 </span><span class=\"r10\"> </span><span 
class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">         Hao Liu</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Emergent Agentic Transformer from Chain of Hindsight Experience                  </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2305.16554v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 16 </span><span class=\"r10\"> </span><span class=\"r11\">2023</span><span class=\"r10\"> </span><span class=\"r12\">      Gazula, H.</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> 3D analysis of dissection photographs with surface scanning and machine</span><span class=\"r15\"> learning</span><span class=\"r14\"> </span><span class=\"r10\"> </span><a class=\"r16\" href=\"https://doi.org/10.1101/2023.06.08.544050\">10.1101/2023.06.08.544050</a><span class=\"r10\">  </span><span class=\"r17\">biorxiv</span><span class=\"r10\"> </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\"> for quantitative neuropathology                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                     
   </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 17 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">   Raneen Younis</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> MTS2Graph: Interpretable Multivariate Time Series Classification with            </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.03834v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Temporal Evolving Graphs                                                       </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 18 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">  Muhammad Waqas</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Modulation Classification Through Deep Learning Using Resolution                 </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.04655v1\">http://arxiv.org/...</a><span 
class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Transformed Spectrograms                                                       </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 19 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">   Sriram Ravula</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Optimizing Sampling Patterns for Compressed Sensing MRI with Diffusion           </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.03284v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Generative Models                                                              </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                     
             </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\"> 20 </span><span class=\"r10\"> </span><span class=\"r11\">2021</span><span class=\"r10\"> </span><span class=\"r12\">    Ruijie Zhang</span><span class=\"r13\"> et al.</span><span class=\"r12\"> </span><span class=\"r14\"> Multi-level Multiple Instance Learning with Transformer for Whole Slide          </span><span class=\"r10\"> </span><a class=\"r16\" href=\"http://arxiv.org/abs/2306.05029v1\">http://arxiv.org/...</a><span class=\"r10\">       </span><span class=\"r17\">arxiv</span><span class=\"r10\">   </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">   Image Classification                                                           </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r6\">    </span><span class=\"r10\">      </span><span class=\"r12\">                        </span><span class=\"r14\">                                                                                  </span><span class=\"r10\">                                    </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>   <span class=\"r10\">                                                           </span><span class=\"r16\">20 papers, recommended by refy 👌</span><span class=\"r10\">                                                            </span>   <span class=\"r1\">│</span>\n<span class=\"r1\">│</span>                                                                                                                                                              <span class=\"r1\">│</span>\n<span 
class=\"r1\">╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n</code>\n    </pre>\n</body>\n</html>\n"
  },
  {
    "path": "workspaces/Vision_09df18b156814c80a3e1c1ab544423fc/results/papers.bib",
    "content": "@ARTICLE{6189435c78b9833da2061c3088e04b8aa5c034e8,\n  title     = \"Trans-ResNet: Integrating Transformers and CNNs for Alzheimer’s disease classification\",\n  author    = \"Chao Li\",\n  abstract  = \"Convolutional neural networks  CNNs  have demonstrated excellent performance for brain disease classification from MRI data  However  CNNs lack the ability to capture global dependencies  The recently proposed architecture called Transformer uses attention mechanisms to match or even outperform CNNs on various vision tasks  Transformer s performance is dependent on access to large training datasets  but sample sizes for most brain MRI datasets are relatively small  To overcome this limitation  we propose Trans ResNet  a novel architecture which integrates the advantages of both CNNs and Transformers  In addition  we pre trained our Trans ResNet on a large scale dataset on the task of brain age estimation for higher performance  Using three neuroimaging cohorts  UK Biobank  AIBL  ADNI   we demonstrated that our Trans ResNet achieved higher classification accuracy on Alzheimer disease prediction compared to other state of the art CNN based methods \",\n  year      = 2022,\n  journal   = \"2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI)\",\n  pages     = \"1-5\"\n}\n\n@ARTICLE{eaf2d681c93c17864146394455068817f6faa69c,\n  title     = \"Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer's Disease Detection\",\n  author    = \"N. 
Dhinagar\",\n  abstract  = \"Neuroimaging of large populations is valuable to identify factors that promote or resist brain disease  and to assist diagnosis  subtyping  and prognosis  Data driven models such as convolutional neural networks  CNNs  have increasingly been applied to brain images to perform diagnostic and prognostic tasks by learning robust features  Vision transformers  ViT    a new class of deep learning architectures   have emerged in recent years as an alternative to CNNs for several computer vision applications  Here we tested variants of the ViT architecture for a range of desired neuroimaging downstream tasks based on difficulty  in this case for sex and Alzheimer s disease  AD  classification based on 3D brain MRI  In our experiments  two vision transformer architecture variants achieved an AUC of 0 987 for sex and 0 892 for AD classification  respectively  We independently evaluated our models on data from two benchmark AD datasets  We achieved a performance boost of 5  and 9 10  upon fine tuning vision transformer models pre trained on synthetic  generated by a latent diffusion model  and real MRI scans  respectively  Our main contributions include testing the effects of different ViT training strategies including pre training  data augmentation and learning rate warm ups followed by annealing  as pertaining to the neuroimaging domain  These techniques are essential for training ViT like models for neuroimaging applications where training data is usually limited  We also analyzed the effect of the amount of training data utilized on the test time performance of the ViT via data model scaling curves \",\n  year      = 2023,\n  journal   = \"ArXiv\",\n  pages     = \"\"\n}\n\n@ARTICLE{7b2cf4599fdbfa7bb073ce3737e6350c8a7d8c3d,\n  title     = \"Vision Transformer Approach for Classification of Alzheimer’s Disease Using 18F-Florbetaben Brain Images\",\n  author    = \"Hyunji Shin\",\n  abstract  = \"Dementia is a degenerative disease that is 
increasingly prevalent in an aging society  Alzheimer s disease  AD   the most common type of dementia  is best mitigated via early detection and management  Deep learning is an artificial intelligence technique that has been used to diagnose and predict diseases by extracting meaningful features from medical images  The convolutional neural network  CNN  is a representative application of deep learning  serving as a powerful tool for the diagnosis of AD  Recently  vision transformers  ViT  have yielded classification performance exceeding that of CNN in some diagnostic image classifications  Because the brain is a very complex network with interrelated regions  ViT  which captures direct relationships between images  may be more effective for brain image analysis than CNN  Therefore  we propose a method for classifying dementia images by applying 18F Florbetaben positron emission tomography  PET  images to ViT  Data were evaluated via binary  normal control and abnormal  and ternary  healthy control  mild cognitive impairment  and AD  classification  In a performance comparison with the CNN  VGG19 was selected as the comparison model  Consequently  ViT yielded more effective performance than VGG19 in binary classification  However  in ternary classification  the performance of ViT cannot be considered excellent  These results show that it is hard to argue that the ViT model is better at AD classification than the CNN model \",\n  year      = 2023,\n  journal   = \"Applied Sciences\",\n  pages     = \"\"\n}\n\n@ARTICLE{9f3cd0b46507d50caf71101f1dc3b23e339a8ba0,\n  title     = \"Comparative Analysis of Alzheimer's Disease Detection via MRI Scans Using Convolutional Neural Network and Vision Transformer\",\n  author    = \"Pinky Sherwani\",\n  abstract  = \"Progressive damage to brain neurons is caused by a neurodegenerative disease  ND  that the body cannot heal or restore  Dementia like Alzheimer s Disease  AD   which affects millions of lives each year  is a well 
known instance of such illnesses  Despite extensive research  the aforementioned disorders currently have no viable therapies  However  a timely diagnosis is essential for the management of diseases  For neurologists  diagnosing NDs is difficult and needs years of education and experience  The paper focuses on the detection of Alzheimer s Disease via Brain MRI Scans using Convolutional Neural Network  CNN  which minimizes an image s high dimensionality without sacrificing its information and using Vision Transformers that are for image classification and apply an architecture akin to a Transformer over selected areas of the image  In the proposed work three different Vision Transformer namely  Vanilla Vision Transformer  Vanilla ViT   Deep Vision Transformer  DeepViT   Class Attention in Image Transformer  CaiT   The Vision Transformers turned out to be better than CNN as when training on fewer datasets  ViT exhibits inductive bias  which increases dependency on model regularisation or data augmentation  AgReg   In terms of accuracy and computing efficiency  ViT models exceed the present CNN by almost a factor of four \",\n  year      = 2023,\n  journal   = \"2023 International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering (ICECONF)\",\n  pages     = \"1-9\"\n}\n\n@ARTICLE{6e6ee835b7388d52e2df688733a2952df38e13f6,\n  title     = \"Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input\",\n  author    = \"Zilun Zhang\",\n  abstract  = \"  Many high performance classification models utilize complex CNN based architectures for Alzheimer s Disease classification  We aim to investigate two relevant questions regarding classification of Alzheimer s Disease using MRI   Do Vision Transformer based models perform better than CNN based models   and  Is it possible to use a shallow 3D CNN based model to obtain satisfying results   To achieve these goals  we propose two models that can take in and 
process 3D MRI scans  Convolutional Voxel Vision Transformer  CVVT  architecture  and ConvNet3D 4  a shallow 4 block 3D CNN based model  Our results indicate that the shallow 3D CNN based models are sufficient to achieve good classification results for Alzheimer s Disease using MRI scans \",\n  year      = 2022,\n  journal   = \"ArXiv\",\n  pages     = \"\"\n}\n\n@ARTICLE{64023908cd9171ded5394d093f9cddb87acc0b59,\n  title     = \"Vision transformers for the prediction of mild cognitive impairment to Alzheimer’s disease progression using mid-sagittal sMRI\",\n  author    = \"Gia-Minh Hoang\",\n  abstract  = \"Background Alzheimer s disease  AD  is one of the most common causes of neurodegenerative disease affecting over 50 million people worldwide  However  most AD diagnosis occurs in the moderate to late stage  which means that the optimal time for treatment has already passed  Mild cognitive impairment  MCI  is an intermediate state between cognitively normal people and AD patients  Therefore  the accurate prediction in the conversion process of MCI to AD may allow patients to start preventive intervention to slow the progression of the disease  Nowadays  neuroimaging techniques have been developed and are used to determine AD related structural biomarkers  Deep learning approaches have rapidly become a key methodology applied to these techniques to find biomarkers  Methods In this study  we aimed to investigate an MCI to AD prediction method using Vision Transformers  ViT  to structural magnetic resonance images  sMRI   The Alzheimer s Disease Neuroimaging Initiative  ADNI  database containing 598 MCI subjects was used to predict MCI subjects  progression to AD  There are three main objectives in our study   i  to propose an MRI based Vision Transformers approach for MCI to AD progression classification   ii  to evaluate the performance of different ViT architectures to obtain the most advisable one  and  iii  to visualize the brain region mostly affect the 
prediction of deep learning approach to MCI progression  Results Our method achieved state of the art classification performance in terms of accuracy  83 27    specificity  85 07    and sensitivity  81 48   compared with a set of conventional methods  Next  we visualized the brain regions that mostly contribute to the prediction of MCI progression for interpretability of the proposed model  The discriminative pathological locations include the thalamus  medial frontal  and occipital corroborating the reliability of our model  Conclusion In conclusion  our methods provide an effective and accurate technique for the prediction of MCI conversion to AD  The results obtained in this study outperform previous reports using the ADNI collection  and it suggests that sMRI based ViT could be efficiently applied with a considerable potential benefit for AD patient management  The brain regions mostly contributing to prediction  in conjunction with the identified anatomical features  will support the building of a robust solution for other neurodegenerative diseases in future \",\n  year      = 2023,\n  journal   = \"Frontiers in Aging Neuroscience\",\n  pages     = \"\"\n}\n\n@ARTICLE{1b4ac29a08b2272bd6b3fbf293544816e9d31c5a,\n  title     = \"SMIL-DeiT:Multiple Instance Learning and Self-supervised Vision Transformer network for Early Alzheimer's disease classification\",\n  author    = \"Yue Yin\",\n  abstract  = \"Early diagnosis of Alzheimer s disease AD  is becoming increasingly important in preventing and treating the disease as the world s population ages  We proposed a SMIL DeiT network for AD classification tasks amongst three groups  Alzheimer s Disease  AD   Mild Cognitive Impairment  MCI   and Normal Cognitive  NC  in this study  Vision Transformer is the fundamental structure of our work  The data pre training is performed utilizing DINO  a self supervised technique  whereas the downstream classification task is done with Multiple Instance Learning  Our proposed 
technique works on the ADNI dataset  We used four performance metrics accuracy rates  precision  recall  and F1 score in the evaluation  the most important of which was accuracy  The accuracy obtained by our method is higher than the transformer s 90 1  and CNN s 90 8   reaching 93 2  \",\n  year      = 2022,\n  journal   = \"2022 International Joint Conference on Neural Networks (IJCNN)\",\n  pages     = \"1-6\"\n}\n\n@ARTICLE{9da3fadf092c864f61d6fd1e8eab5a6ca2397194,\n  title     = \"Classification of Alzheimer's Disease via Vision Transformer: Classification of Alzheimer's Disease via Vision Transformer\",\n  author    = \"Yanjun Lyu\",\n  abstract  = \"Deep models are powerful in capturing the complex and non linear relationship buried in brain imaging data  However  the huge number of parameters in deep models can easily overfit given limited imaging data samples  In this work  we proposed a cross domain transfer learning method to solve the insufficient data problem in brain imaging domain by leveraging the knowledge learned in natural image domain  Specifically  we employed ViT as the backbone and firstly pretrained it using ImageNet 21K dataset and then transferred to the brain imaging dataset  A slice wise convolution embedding method was developed to improve the standard patch operation in vanilla ViT  Our method was evaluated based on AD CN classification task  We also conducted extensive experiments to compare the transfer performance with different transfer strategies  models  and sample size  The results suggest that the proposed method can effectively transfer the knowledge learned in natural image domain to brain imaging area and may provide a promising way to take advantages of the pretrained model in data intensive applications  Moreover  the proposed cross domain transfer learning method can obtain comparable classification performance compared to most recent studies \",\n  year      = 2022,\n  journal   = \"Proceedings of the 15th International 
Conference on PErvasive Technologies Related to Assistive Environments\",\n  pages     = \"\"\n}\n\n@ARTICLE{adb781ca8eeea0f2cb4422b3fd632c0fd4ee2448,\n  title     = \"Advit: Vision Transformer On Multi-Modality Pet Images For Alzheimer Disease Diagnosis\",\n  author    = \"Xin Xing\",\n  abstract  = \"We present a new model trained on multi modalities of Positron Emission Tomography images  PET AV45 and PET FDG  for Alzheimer s Disease  AD  diagnosis  Unlike the conventional methods using multi modal 3D 2D CNN architecture  our design replaces the Convolutional Neural Net work  CNN  by Vision Transformer  ViT   Considering the high computation cost of 3D images  we firstly employ a 3D to 2D operation to project the 3D PET images into 2D fusion images  Then  we forward the fused multi modal 2D images to a parallel ViT model for feature extraction  followed by classification for AD diagnosis  For evaluation  we use PET images from ADNI  The proposed model outperforms several strong baseline models in our experiments and achieves 0 91 accuracy and 0 95 AUC \",\n  year      = 2022,\n  journal   = \"2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI)\",\n  pages     = \"1-4\"\n}\n\n@ARTICLE{0e681598a83e3e55cfbc34a07a222e99e9f4c445,\n  title     = \"CrossViT Wide Residual Squeeze-and-Excitation Network for Alzheimer's disease classification with self attention ProGAN data augmentation\",\n  author    = \"Rahma Kadri\",\n  abstract  = \"Efficient and accurate early prediction of Alzheimer s disease  AD  based on the neuroimaging data has attracted interest from many researchers to prevent its progression  Deep learning networks have demonstrated an optimal ability to analyse large scale multimodal neuroimaging for AD classification  The most widely used architecture of deep learning is the Convolution neural networks  CNN  that have shown great potential in AD detection  However CNN does not capture long range dependencies within the input image and does 
not ensure a good global feature extraction  Furthermore  increasing the receptive field of CNN by increasing the kernels sizes can cause a feature granularity loss  Another limitation is that CNN lacks a weighing mechanism of image features  the network doesn t focus on the relevant features within the image  Recently vision transformer have shown an outstanding performance over the CNN and overcomes its main limitations  The vision transformer relies on the self attention layers  The main drawbacks of this new technique is that it requires a huge amount of training data  In this paper  we combined the main strengths of these two architectures for AD classification  We proposed a new method based on the combination of the Cross ViT and Wide Residual Squeeze and Excitation Network  We acquired MRI data from the Alzheimer s Disease Neuroimaging Initiative  ADNI  and the Open Access Series of Imaging Studies  OASIS   We also proposed a new data augmentation based on the self attention progressive generative adversarial neural network to overcome the limitation of the data  Our proposed method achieved 99  classification accuracy and outperforms CNN models \",\n  year      = 2022,\n  journal   = \"Int. J. Hybrid Intell. 
Syst.\",\n  pages     = \"163-177\"\n}\n\n@ARTICLE{af73f6878044012034023640918cf31d7f4e1522,\n  title     = \"Comparison of Anatomical and Diffusion MRI for detecting Parkinson’s Disease using Deep Convolutional Neural Network\",\n  author    = \"Tamoghna Chattopadhyay\",\n  abstract  = \"Parkinson s disease  PD  is a progressive neurodegenerative disease that affects over 10 million people worldwide  Brain atrophy and microstructural abnormalities tend to be more subtle in PD than in other age related conditions such as Alzheimer s disease  so there is interest in how well machine learning methods can detect PD in radiological scans  Deep learning models based on convolutional neural networks  CNNs  can automatically distil diagnostically useful features from raw MRI scans  but most CNN based deep learning models have only been tested on T1 weighted brain MRI  Here we examine the added value of diffusion weighted MRI  dMRI    a variant of MRI  sensitive to microstructural tissue properties   as an additional input in CNN based models for PD classification  Our evaluations used data from 3 separate cohorts   from Chang Gung University  the University of Pennsylvania  and the PPMI dataset  We trained CNNs on various combinations of these cohorts to find the best predictive model  Although tests on more diverse data are warranted  deep learned models from dMRI show promise for PD classification  Clinical Relevance This study supports the use of diffusion weighted images as an alternative to anatomical images for AI based detection of Parkinson s disease \",\n  year      = 2023,\n  journal   = \"bioRxiv\",\n  pages     = \"\"\n}\n\n@ARTICLE{003f528634735ece46286609f6a074fbfa435722,\n  title     = \"Pre-trained deep learning models for brain MRI image classification\",\n  author    = \"Srigiri Krishnapriya\",\n  abstract  = \"Brain tumors are serious conditions caused by uncontrolled and abnormal cell division  Tumors can have devastating implications if not accurately 
and promptly detected  Magnetic resonance imaging  MRI  is one of the methods frequently used to detect brain tumors owing to its excellent resolution  In the past few decades  substantial research has been conducted in the field of classifying brain images  ranging from traditional methods to deep learning techniques such as convolutional neural networks  CNN   To accomplish classification  machine learning methods require manually created features  In contrast  CNN achieves classification by extracting visual features from unprocessed images  The size of the training dataset had a significant impact on the features that CNN extracts  The CNN tends to overfit when its size is small  Deep CNNs  DCNN  with transfer learning have therefore been developed  The aim of this work was to investigate the brain MR image categorization potential of pre trained DCNN VGG 19  VGG 16  ResNet50  and Inception V3 models using data augmentation and transfer learning techniques  Validation of the test set utilizing accuracy  recall  Precision  and F1 score showed that the pre trained VGG 19 model with transfer learning exhibited the best performance  In addition  these methods offer an end to end classification of raw images without the need for manual attribute extraction \",\n  year      = 2023,\n  journal   = \"Frontiers in Human Neuroscience\",\n  pages     = \"\"\n}\n\n"
  },
  {
    "path": "workspaces/Vision_09df18b156814c80a3e1c1ab544423fc/results/papers.csv",
    "content": "paperId,title,first_author,year,abstract,tldr,bibtex,influentialCitationCount,venue,journal,pages\r\r\n6189435c78b9833da2061c3088e04b8aa5c034e8,Trans-ResNet: Integrating Transformers and CNNs for Alzheimer’s disease classification,Chao Li,2022,\"Convolutional neural networks (CNNs) have demonstrated excellent performance for brain disease classification from MRI data. However, CNNs lack the ability to capture global dependencies. The recently proposed architecture called Transformer uses attention mechanisms to match or even outperform CNNs on various vision tasks. Transformer’s performance is dependent on access to large training datasets, but sample sizes for most brain MRI datasets are relatively small. To overcome this limitation, we propose Trans-ResNet, a novel architecture which integrates the advantages of both CNNs and Transformers. In addition, we pre-trained our Trans-ResNet on a large-scale dataset on the task of brain age estimation for higher performance. Using three neuroimaging cohorts (UK Biobank, AIBL, ADNI), we demonstrated that our Trans-ResNet achieved higher classification accuracy on Alzheimer disease prediction compared to other state-of-the-art CNN-based methods.\",\"Convolutional neural networks (CNNs) have demonstrated excellent performance for brain disease classification from MRI data. However, CNNs lack the ability to capture global dependencies. The recently proposed architecture called Transformer uses attention mechanisms to match or even outperform CNNs on various vision tasks. Transformer’s performance is dependent on access to large training datasets, but sample sizes for most brain MRI datasets are relatively small. To overcome this limitation, we propose Trans-ResNet, a novel architecture which integrates the advantages of both CNNs and Transformers. In addition, we pre-trained our Trans-ResNet on a large-scale dataset on the task of brain age estimation for higher performance. 
Using three neuroimaging cohorts (UK Biobank, AIBL, ADNI), we demonstrated that our Trans-ResNet achieved higher classification accuracy on Alzheimer disease prediction compared to other state-of-the-art CNN-based methods.\",\"@Article{Li2022TransResNetIT,\r\n author = {Chao Li and Yue Cui and Na Luo and Yong Liu and P. Bourgeat and J. Fripp and Tianzi Jiang},\r\n booktitle = {IEEE International Symposium on Biomedical Imaging},\r\n journal = {2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI)},\r\n pages = {1-5},\r\n title = {Trans-ResNet: Integrating Transformers and CNNs for Alzheimer’s disease classification},\r\n year = {2022}\r\n}\r\n\",0,IEEE International Symposium on Biomedical Imaging,2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI),1-5\r\r\neaf2d681c93c17864146394455068817f6faa69c,Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer's Disease Detection,N. Dhinagar,2023,\"Neuroimaging of large populations is valuable to identify factors that promote or resist brain disease, and to assist diagnosis, subtyping, and prognosis. Data-driven models such as convolutional neural networks (CNNs) have increasingly been applied to brain images to perform diagnostic and prognostic tasks by learning robust features. Vision transformers (ViT) - a new class of deep learning architectures - have emerged in recent years as an alternative to CNNs for several computer vision applications. Here we tested variants of the ViT architecture for a range of desired neuroimaging downstream tasks based on difficulty, in this case for sex and Alzheimer's disease (AD) classification based on 3D brain MRI. In our experiments, two vision transformer architecture variants achieved an AUC of 0.987 for sex and 0.892 for AD classification, respectively. We independently evaluated our models on data from two benchmark AD datasets. 
We achieved a performance boost of 5% and 9-10% upon fine-tuning vision transformer models pre-trained on synthetic (generated by a latent diffusion model) and real MRI scans, respectively. Our main contributions include testing the effects of different ViT training strategies including pre-training, data augmentation and learning rate warm-ups followed by annealing, as pertaining to the neuroimaging domain. These techniques are essential for training ViT-like models for neuroimaging applications where training data is usually limited. We also analyzed the effect of the amount of training data utilized on the test-time performance of the ViT via data-model scaling curves.\",\"Neuroimaging of large populations is valuable to identify factors that promote or resist brain disease, and to assist diagnosis, subtyping, and prognosis. Data-driven models such as convolutional neural networks (CNNs) have increasingly been applied to brain images to perform diagnostic and prognostic tasks by learning robust features. Vision transformers (ViT) - a new class of deep learning architectures - have emerged in recent years as an alternative to CNNs for several computer vision applications. Here we tested variants of the ViT architecture for a range of desired neuroimaging downstream tasks based on difficulty, in this case for sex and Alzheimer's disease (AD) classification based on 3D brain MRI. In our experiments, two vision transformer architecture variants achieved an AUC of 0.987 for sex and 0.892 for AD classification, respectively. We independently evaluated our models on data from two benchmark AD datasets. We achieved a performance boost of 5% and 9-10% upon fine-tuning vision transformer models pre-trained on synthetic (generated by a latent diffusion model) and real MRI scans, respectively. 
Our main contributions include testing the effects of different ViT training strategies including pre-training, data augmentation and learning rate warm-ups followed by annealing, as pertaining to the neuroimaging domain. These techniques are essential for training ViT-like models for neuroimaging applications where training data is usually limited. We also analyzed the effect of the amount of training data utilized on the test-time performance of the ViT via data-model scaling curves.\",\"@Article{Dhinagar2023EfficientlyTV,\r\n author = {N. Dhinagar and S. Thomopoulos and Emily Laltoo and Paul M. Thompson},\r\n booktitle = {arXiv.org},\r\n journal = {ArXiv},\r\n title = {Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer's Disease Detection},\r\n year = {2023}\r\n}\r\n\",0,arXiv.org,ArXiv,\r\r\n7b2cf4599fdbfa7bb073ce3737e6350c8a7d8c3d,Vision Transformer Approach for Classification of Alzheimer’s Disease Using 18F-Florbetaben Brain Images,Hyunji Shin,2023,\"Dementia is a degenerative disease that is increasingly prevalent in an aging society. Alzheimer’s disease (AD), the most common type of dementia, is best mitigated via early detection and management. Deep learning is an artificial intelligence technique that has been used to diagnose and predict diseases by extracting meaningful features from medical images. The convolutional neural network (CNN) is a representative application of deep learning, serving as a powerful tool for the diagnosis of AD. Recently, vision transformers (ViT) have yielded classification performance exceeding that of CNN in some diagnostic image classifications. Because the brain is a very complex network with interrelated regions, ViT, which captures direct relationships between images, may be more effective for brain image analysis than CNN. Therefore, we propose a method for classifying dementia images by applying 18F-Florbetaben positron emission tomography (PET) images to ViT. 
Data were evaluated via binary (normal control and abnormal) and ternary (healthy control, mild cognitive impairment, and AD) classification. In a performance comparison with the CNN, VGG19 was selected as the comparison model. Consequently, ViT yielded more effective performance than VGG19 in binary classification. However, in ternary classification, the performance of ViT cannot be considered excellent. These results show that it is hard to argue that the ViT model is better at AD classification than the CNN model.\",\"Dementia is a degenerative disease that is increasingly prevalent in an aging society. Alzheimer’s disease (AD), the most common type of dementia, is best mitigated via early detection and management. Deep learning is an artificial intelligence technique that has been used to diagnose and predict diseases by extracting meaningful features from medical images. The convolutional neural network (CNN) is a representative application of deep learning, serving as a powerful tool for the diagnosis of AD. Recently, vision transformers (ViT) have yielded classification performance exceeding that of CNN in some diagnostic image classifications. Because the brain is a very complex network with interrelated regions, ViT, which captures direct relationships between images, may be more effective for brain image analysis than CNN. Therefore, we propose a method for classifying dementia images by applying 18F-Florbetaben positron emission tomography (PET) images to ViT. Data were evaluated via binary (normal control and abnormal) and ternary (healthy control, mild cognitive impairment, and AD) classification. In a performance comparison with the CNN, VGG19 was selected as the comparison model. Consequently, ViT yielded more effective performance than VGG19 in binary classification. However, in ternary classification, the performance of ViT cannot be considered excellent. 
These results show that it is hard to argue that the ViT model is better at AD classification than the CNN model.\",\"@Article{Shin2023VisionTA,\r\n author = {Hyunji Shin and Soomin Jeon and Youngsoo Seol and Sangjin Kim and Doyoung Kang},\r\n booktitle = {Applied Sciences},\r\n journal = {Applied Sciences},\r\n title = {Vision Transformer Approach for Classification of Alzheimer’s Disease Using 18F-Florbetaben Brain Images},\r\n year = {2023}\r\n}\r\n\",0,Applied Sciences,Applied Sciences,\r\r\n9f3cd0b46507d50caf71101f1dc3b23e339a8ba0,Comparative Analysis of Alzheimer's Disease Detection via MRI Scans Using Convolutional Neural Network and Vision Transformer,Pinky Sherwani,2023,\"Progressive damage to brain neurons is caused by a neurodegenerative disease (ND) that the body cannot heal or restore. Dementia like Alzheimer's Disease (AD), which affects millions of lives each year, is a well-known instance of such illnesses. Despite extensive research, the aforementioned disorders currently have no viable therapies. However, a timely diagnosis is essential for the management of diseases. For neurologists, diagnosing NDs is difficult and needs years of education and experience. The paper focuses on the detection of Alzheimer's Disease via Brain MRI Scans using Convolutional Neural Network (CNN) which minimizes an image's high dimensionality without sacrificing its information and using Vision Transformers that are for image classification and apply an architecture akin to a Transformer over selected areas of the image. In the proposed work three different Vision Transformer namely: Vanilla Vision Transformer (Vanilla ViT), Deep Vision Transformer (DeepViT), Class Attention in Image Transformer (CaiT). The Vision Transformers turned out to be better than CNN as when training on fewer datasets, ViT exhibits inductive bias, which increases dependency on model regularisation or data augmentation (AgReg). 
In terms of accuracy and computing efficiency, ViT models exceed the present CNN by almost a factor of four.\",\"Progressive damage to brain neurons is caused by a neurodegenerative disease (ND) that the body cannot heal or restore. Dementia like Alzheimer's Disease (AD), which affects millions of lives each year, is a well-known instance of such illnesses. Despite extensive research, the aforementioned disorders currently have no viable therapies. However, a timely diagnosis is essential for the management of diseases. For neurologists, diagnosing NDs is difficult and needs years of education and experience. The paper focuses on the detection of Alzheimer's Disease via Brain MRI Scans using Convolutional Neural Network (CNN) which minimizes an image's high dimensionality without sacrificing its information and using Vision Transformers that are for image classification and apply an architecture akin to a Transformer over selected areas of the image. In the proposed work three different Vision Transformer namely: Vanilla Vision Transformer (Vanilla ViT), Deep Vision Transformer (DeepViT), Class Attention in Image Transformer (CaiT). The Vision Transformers turned out to be better than CNN as when training on fewer datasets, ViT exhibits inductive bias, which increases dependency on model regularisation or data augmentation (AgReg). In terms of accuracy and computing efficiency, ViT models exceed the present CNN by almost a factor of four.\",\"@Conference{Sherwani2023ComparativeAO,\r\n author = {Pinky Sherwani and P. Nandhakumar and Pihu Srivastava and Jayant Jagtap and Viren Narvekar and H. 
R.},\r\n booktitle = {2023 International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering (ICECONF)},\r\n journal = {2023 International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering (ICECONF)},\r\n pages = {1-9},\r\n title = {Comparative Analysis of Alzheimer's Disease Detection via MRI Scans Using Convolutional Neural Network and Vision Transformer},\r\n year = {2023}\r\n}\r\n\",0,2023 International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering (ICECONF),2023 International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering (ICECONF),1-9\r\r\n6e6ee835b7388d52e2df688733a2952df38e13f6,Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input,Zilun Zhang,2022,\". Many high-performance classification models utilize complex CNN-based architectures for Alzheimer’s Disease classification. We aim to investigate two relevant questions regarding classification of Alzheimer’s Disease using MRI: “Do Vision Transformer-based models perform better than CNN-based models?” and “Is it possible to use a shallow 3D CNN-based model to obtain satisfying results?” To achieve these goals, we propose two models that can take in and process 3D MRI scans: Convolutional Voxel Vision Transformer (CVVT) architecture, and ConvNet3D-4, a shallow 4-block 3D CNN-based model. Our results indicate that the shallow 3D CNN-based models are sufficient to achieve good classification results for Alzheimer’s Disease using MRI scans.\",\". Many high-performance classification models utilize complex CNN-based architectures for Alzheimer’s Disease classification. 
We aim to investigate two relevant questions regarding classification of Alzheimer’s Disease using MRI: “Do Vision Transformer-based models perform better than CNN-based models?” and “Is it possible to use a shallow 3D CNN-based model to obtain satisfying results?” To achieve these goals, we propose two models that can take in and process 3D MRI scans: Convolutional Voxel Vision Transformer (CVVT) architecture, and ConvNet3D-4, a shallow 4-block 3D CNN-based model. Our results indicate that the shallow 3D CNN-based models are sufficient to achieve good classification results for Alzheimer’s Disease using MRI scans.\",\"@Article{Zhang2022IntroducingVT,\r\n author = {Zilun Zhang and F. Khalvati},\r\n booktitle = {arXiv.org},\r\n journal = {ArXiv},\r\n title = {Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input},\r\n volume = {abs/2210.01177},\r\n year = {2022}\r\n}\r\n\",0,arXiv.org,ArXiv,\r\r\n64023908cd9171ded5394d093f9cddb87acc0b59,Vision transformers for the prediction of mild cognitive impairment to Alzheimer’s disease progression using mid-sagittal sMRI,Gia-Minh Hoang,2023,\"Background Alzheimer’s disease (AD) is one of the most common causes of neurodegenerative disease affecting over 50 million people worldwide. However, most AD diagnosis occurs in the moderate to late stage, which means that the optimal time for treatment has already passed. Mild cognitive impairment (MCI) is an intermediate state between cognitively normal people and AD patients. Therefore, the accurate prediction in the conversion process of MCI to AD may allow patients to start preventive intervention to slow the progression of the disease. Nowadays, neuroimaging techniques have been developed and are used to determine AD-related structural biomarkers. Deep learning approaches have rapidly become a key methodology applied to these techniques to find biomarkers. 
Methods In this study, we aimed to investigate an MCI-to-AD prediction method using Vision Transformers (ViT) to structural magnetic resonance images (sMRI). The Alzheimer’s Disease Neuroimaging Initiative (ADNI) database containing 598 MCI subjects was used to predict MCI subjects’ progression to AD. There are three main objectives in our study: (i) to propose an MRI-based Vision Transformers approach for MCI to AD progression classification, (ii) to evaluate the performance of different ViT architectures to obtain the most advisable one, and (iii) to visualize the brain region mostly affect the prediction of deep learning approach to MCI progression. Results Our method achieved state-of-the-art classification performance in terms of accuracy (83.27%), specificity (85.07%), and sensitivity (81.48%) compared with a set of conventional methods. Next, we visualized the brain regions that mostly contribute to the prediction of MCI progression for interpretability of the proposed model. The discriminative pathological locations include the thalamus, medial frontal, and occipital—corroborating the reliability of our model. Conclusion In conclusion, our methods provide an effective and accurate technique for the prediction of MCI conversion to AD. The results obtained in this study outperform previous reports using the ADNI collection, and it suggests that sMRI-based ViT could be efficiently applied with a considerable potential benefit for AD patient management. The brain regions mostly contributing to prediction, in conjunction with the identified anatomical features, will support the building of a robust solution for other neurodegenerative diseases in future.\",\"Background Alzheimer’s disease (AD) is one of the most common causes of neurodegenerative disease affecting over 50 million people worldwide. However, most AD diagnosis occurs in the moderate to late stage, which means that the optimal time for treatment has already passed. 
Mild cognitive impairment (MCI) is an intermediate state between cognitively normal people and AD patients. Therefore, the accurate prediction in the conversion process of MCI to AD may allow patients to start preventive intervention to slow the progression of the disease. Nowadays, neuroimaging techniques have been developed and are used to determine AD-related structural biomarkers. Deep learning approaches have rapidly become a key methodology applied to these techniques to find biomarkers. Methods In this study, we aimed to investigate an MCI-to-AD prediction method using Vision Transformers (ViT) to structural magnetic resonance images (sMRI). The Alzheimer’s Disease Neuroimaging Initiative (ADNI) database containing 598 MCI subjects was used to predict MCI subjects’ progression to AD. There are three main objectives in our study: (i) to propose an MRI-based Vision Transformers approach for MCI to AD progression classification, (ii) to evaluate the performance of different ViT architectures to obtain the most advisable one, and (iii) to visualize the brain region mostly affect the prediction of deep learning approach to MCI progression. Results Our method achieved state-of-the-art classification performance in terms of accuracy (83.27%), specificity (85.07%), and sensitivity (81.48%) compared with a set of conventional methods. Next, we visualized the brain regions that mostly contribute to the prediction of MCI progression for interpretability of the proposed model. The discriminative pathological locations include the thalamus, medial frontal, and occipital—corroborating the reliability of our model. Conclusion In conclusion, our methods provide an effective and accurate technique for the prediction of MCI conversion to AD. The results obtained in this study outperform previous reports using the ADNI collection, and it suggests that sMRI-based ViT could be efficiently applied with a considerable potential benefit for AD patient management. 
The brain regions mostly contributing to prediction, in conjunction with the identified anatomical features, will support the building of a robust solution for other neurodegenerative diseases in future.\",\"@Article{Hoang2023VisionTF,\r\n author = {Gia-Minh Hoang and Ue-Hwan Kim and Jae Gwan Kim},\r\n booktitle = {Frontiers in Aging Neuroscience},\r\n journal = {Frontiers in Aging Neuroscience},\r\n title = {Vision transformers for the prediction of mild cognitive impairment to Alzheimer’s disease progression using mid-sagittal sMRI},\r\n volume = {15},\r\n year = {2023}\r\n}\r\n\",0,Frontiers in Aging Neuroscience,Frontiers in Aging Neuroscience,\r\r\n1b4ac29a08b2272bd6b3fbf293544816e9d31c5a,SMIL-DeiT:Multiple Instance Learning and Self-supervised Vision Transformer network for Early Alzheimer's disease classification,Yue Yin,2022,\"Early diagnosis of Alzheimer's disease(AD) is becoming increasingly important in preventing and treating the disease as the world's population ages. We proposed a SMIL-DeiT network for AD classification tasks amongst three groups: Alzheimer's Disease (AD), Mild Cognitive Impairment (MCI), and Normal Cognitive (NC) in this study. Vision Transformer is the fundamental structure of our work. The data pre-training is performed utilizing DINO, a self-supervised technique, whereas the downstream classification task is done with Multiple Instance Learning. Our proposed technique works on the ADNI dataset. We used four performance metrics accuracy rates, precision, recall, and Fl-score in the evaluation, the most important of which was accuracy. The accuracy obtained by our method is higher than the transformer's 90.1% and CNN's 90.8%, reaching 93.2%.\",\"Early diagnosis of Alzheimer's disease(AD) is becoming increasingly important in preventing and treating the disease as the world's population ages. 
We proposed a SMIL-DeiT network for AD classification tasks amongst three groups: Alzheimer's Disease (AD), Mild Cognitive Impairment (MCI), and Normal Cognitive (NC) in this study. Vision Transformer is the fundamental structure of our work. The data pre-training is performed utilizing DINO, a self-supervised technique, whereas the downstream classification task is done with Multiple Instance Learning. Our proposed technique works on the ADNI dataset. We used four performance metrics accuracy rates, precision, recall, and Fl-score in the evaluation, the most important of which was accuracy. The accuracy obtained by our method is higher than the transformer's 90.1% and CNN's 90.8%, reaching 93.2%.\",\"@Article{Yin2022SMILDeiTMultipleIL,\r\n author = {Yue Yin and Weikang Jin and Jing Bai and Ruotong Liu and Haowei Zhen},\r\n booktitle = {IEEE International Joint Conference on Neural Network},\r\n journal = {2022 International Joint Conference on Neural Networks (IJCNN)},\r\n pages = {1-6},\r\n title = {SMIL-DeiT:Multiple Instance Learning and Self-supervised Vision Transformer network for Early Alzheimer's disease classification},\r\n year = {2022}\r\n}\r\n\",0,IEEE International Joint Conference on Neural Network,2022 International Joint Conference on Neural Networks (IJCNN),1-6\r\r\n9da3fadf092c864f61d6fd1e8eab5a6ca2397194,Classification of Alzheimer's Disease via Vision Transformer: Classification of Alzheimer's Disease via Vision Transformer,Yanjun Lyu,2022,\"Deep models are powerful in capturing the complex and non-linear relationship buried in brain imaging data. However, the huge number of parameters in deep models can easily overfit given limited imaging data samples. In this work, we proposed a cross-domain transfer learning method to solve the insufficient data problem in brain imaging domain by leveraging the knowledge learned in natural image domain. 
Specifically, we employed ViT as the backbone and firstly pretrained it using ImageNet-21K dataset and then transferred to the brain imaging dataset. A slice-wise convolution embedding method was developed to improve the standard patch operation in vanilla ViT. Our method was evaluated based on AD/CN classification task. We also conducted extensive experiments to compare the transfer performance with different transfer strategies, models, and sample size. The results suggest that the proposed method can effectively transfer the knowledge learned in natural image domain to brain imaging area and may provide a promising way to take advantages of the pretrained model in data-intensive applications. Moreover, the proposed cross-domain transfer learning method can obtain comparable classification performance compared to most recent studies.\",\"Deep models are powerful in capturing the complex and non-linear relationship buried in brain imaging data. However, the huge number of parameters in deep models can easily overfit given limited imaging data samples. In this work, we proposed a cross-domain transfer learning method to solve the insufficient data problem in brain imaging domain by leveraging the knowledge learned in natural image domain. Specifically, we employed ViT as the backbone and firstly pretrained it using ImageNet-21K dataset and then transferred to the brain imaging dataset. A slice-wise convolution embedding method was developed to improve the standard patch operation in vanilla ViT. Our method was evaluated based on AD/CN classification task. We also conducted extensive experiments to compare the transfer performance with different transfer strategies, models, and sample size. The results suggest that the proposed method can effectively transfer the knowledge learned in natural image domain to brain imaging area and may provide a promising way to take advantages of the pretrained model in data-intensive applications. 
Moreover, the proposed cross-domain transfer learning method can obtain comparable classification performance compared to most recent studies.\",\"@Article{Lyu2022ClassificationOA,\r\n author = {Yanjun Lyu and Xiao-Wen Yu and Dajiang Zhu and Lu Zhang},\r\n booktitle = {Petra},\r\n journal = {Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments},\r\n title = {Classification of Alzheimer's Disease via Vision Transformer: Classification of Alzheimer's Disease via Vision Transformer},\r\n year = {2022}\r\n}\r\n\",0,Petra,Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments,\r\r\nadb781ca8eeea0f2cb4422b3fd632c0fd4ee2448,Advit: Vision Transformer On Multi-Modality Pet Images For Alzheimer Disease Diagnosis,Xin Xing,2022,\"We present a new model trained on multi-modalities of Positron Emission Tomography images (PET-AV45 and PET-FDG) for Alzheimer’s Disease (AD) diagnosis. Unlike the conventional methods using multi-modal 3D/2D CNN architecture, our design replaces the Convolutional Neural Net-work (CNN) by Vision Transformer (ViT). Considering the high computation cost of 3D images, we firstly employ a 3D-to-2D operation to project the 3D PET images into 2D fusion images. Then, we forward the fused multi-modal 2D images to a parallel ViT model for feature extraction, followed by classification for AD diagnosis. For evaluation, we use PET images from ADNI. The proposed model outperforms several strong baseline models in our experiments and achieves 0.91 accuracy and 0.95 AUC.\",\"We present a new model trained on multi-modalities of Positron Emission Tomography images (PET-AV45 and PET-FDG) for Alzheimer’s Disease (AD) diagnosis. Unlike the conventional methods using multi-modal 3D/2D CNN architecture, our design replaces the Convolutional Neural Net-work (CNN) by Vision Transformer (ViT). 
Considering the high computation cost of 3D images, we firstly employ a 3D-to-2D operation to project the 3D PET images into 2D fusion images. Then, we forward the fused multi-modal 2D images to a parallel ViT model for feature extraction, followed by classification for AD diagnosis. For evaluation, we use PET images from ADNI. The proposed model outperforms several strong baseline models in our experiments and achieves 0.91 accuracy and 0.95 AUC.\",\"@Article{Xing2022AdvitVT,\r\n author = {Xin Xing and G. Liang and Yu Zhang and Subash Khanal and Ai-Ling Lin and Nathan Jacobs},\r\n booktitle = {IEEE International Symposium on Biomedical Imaging},\r\n journal = {2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI)},\r\n pages = {1-4},\r\n title = {Advit: Vision Transformer On Multi-Modality Pet Images For Alzheimer Disease Diagnosis},\r\n year = {2022}\r\n}\r\n\",1,IEEE International Symposium on Biomedical Imaging,2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI),1-4\r\r\n0e681598a83e3e55cfbc34a07a222e99e9f4c445,CrossViT Wide Residual Squeeze-and-Excitation Network for Alzheimer's disease classification with self attention ProGAN data augmentation,Rahma Kadri,2022,\"Efficient and accurate early prediction of Alzheimer's disease (AD) based on the neuroimaging data has attracted interest from many researchers to prevent its progression. Deep learning networks have demonstrated an optimal ability to analyse large-scale multimodal neuroimaging for AD classification. The most widely used architecture of deep learning is the Convolution neural networks (CNN) that have shown great potential in AD detection. However CNN does not capture long range dependencies within the input image and does not ensure a good global feature extraction. Furthermore, increasing the receptive field of CNN by increasing the kernels sizes can cause a feature granularity loss. 
Another limitation is that CNN lacks a weighing mechanism of image features; the network doesn’t focus on the relevant features within the image. Recently,vision transformer have shown an outstanding performance over the CNN and overcomes its main limitations. The vision transformer relies on the self-attention layers. The main drawbacks of this new technique is that it requires a huge amount of training data. In this paper, we combined the main strengths of these two architectures for AD classification. We proposed a new method based on the combination of the Cross ViT and Wide Residual Squeeze-and-Excitation Network. We acquired MRI data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies (OASIS). We also proposed a new data augmentation based on the self attention progressive generative adversarial neural network to overcome the limitation of the data. Our proposed method achieved 99% classification accuracy and outperforms CNN models.\",\"Efficient and accurate early prediction of Alzheimer's disease (AD) based on the neuroimaging data has attracted interest from many researchers to prevent its progression. Deep learning networks have demonstrated an optimal ability to analyse large-scale multimodal neuroimaging for AD classification. The most widely used architecture of deep learning is the Convolution neural networks (CNN) that have shown great potential in AD detection. However CNN does not capture long range dependencies within the input image and does not ensure a good global feature extraction. Furthermore, increasing the receptive field of CNN by increasing the kernels sizes can cause a feature granularity loss. Another limitation is that CNN lacks a weighing mechanism of image features; the network doesn’t focus on the relevant features within the image. Recently,vision transformer have shown an outstanding performance over the CNN and overcomes its main limitations. 
The vision transformer relies on the self-attention layers. The main drawbacks of this new technique is that it requires a huge amount of training data. In this paper, we combined the main strengths of these two architectures for AD classification. We proposed a new method based on the combination of the Cross ViT and Wide Residual Squeeze-and-Excitation Network. We acquired MRI data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies (OASIS). We also proposed a new data augmentation based on the self attention progressive generative adversarial neural network to overcome the limitation of the data. Our proposed method achieved 99% classification accuracy and outperforms CNN models.\",\"@Article{Kadri2022CrossViTWR,\r\n author = {Rahma Kadri and B. Bouaziz and M. Tmar and F. Gargouri},\r\n booktitle = {International Journal of Hybrid Intelligent Systems},\r\n journal = {Int. J. Hybrid Intell. Syst.},\r\n pages = {163-177},\r\n title = {CrossViT Wide Residual Squeeze-and-Excitation Network for Alzheimer's disease classification with self attention ProGAN data augmentation},\r\n volume = {17},\r\n year = {2022}\r\n}\r\n\",0,International Journal of Hybrid Intelligent Systems,Int. J. Hybrid Intell. Syst.,163-177\r\r\naf73f6878044012034023640918cf31d7f4e1522,Comparison of Anatomical and Diffusion MRI for detecting Parkinson’s Disease using Deep Convolutional Neural Network,Tamoghna Chattopadhyay,2023,\"Parkinson’s disease (PD) is a progressive neurodegenerative disease that affects over 10 million people worldwide. Brain atrophy and microstructural abnormalities tend to be more subtle in PD than in other age-related conditions such as Alzheimer’s disease, so there is interest in how well machine learning methods can detect PD in radiological scans. 
Deep learning models based on convolutional neural networks (CNNs) can automatically distil diagnostically useful features from raw MRI scans, but most CNN-based deep learning models have only been tested on T1-weighted brain MRI. Here we examine the added value of diffusion-weighted MRI (dMRI) - a variant of MRI, sensitive to microstructural tissue properties - as an additional input in CNN-based models for PD classification. Our evaluations used data from 3 separate cohorts - from Chang Gung University, the University of Pennsylvania, and the PPMI dataset. We trained CNNs on various combinations of these cohorts to find the best predictive model. Although tests on more diverse data are warranted, deep-learned models from dMRI show promise for PD classification. Clinical Relevance This study supports the use of diffusion-weighted images as an alternative to anatomical images for AI-based detection of Parkinson’s disease.\",\"Parkinson’s disease (PD) is a progressive neurodegenerative disease that affects over 10 million people worldwide. Brain atrophy and microstructural abnormalities tend to be more subtle in PD than in other age-related conditions such as Alzheimer’s disease, so there is interest in how well machine learning methods can detect PD in radiological scans. Deep learning models based on convolutional neural networks (CNNs) can automatically distil diagnostically useful features from raw MRI scans, but most CNN-based deep learning models have only been tested on T1-weighted brain MRI. Here we examine the added value of diffusion-weighted MRI (dMRI) - a variant of MRI, sensitive to microstructural tissue properties - as an additional input in CNN-based models for PD classification. Our evaluations used data from 3 separate cohorts - from Chang Gung University, the University of Pennsylvania, and the PPMI dataset. We trained CNNs on various combinations of these cohorts to find the best predictive model. 
Although tests on more diverse data are warranted, deep-learned models from dMRI show promise for PD classification. Clinical Relevance This study supports the use of diffusion-weighted images as an alternative to anatomical images for AI-based detection of Parkinson’s disease.\",\"@Article{Chattopadhyay2023ComparisonOA,\r\n author = {Tamoghna Chattopadhyay and Amit Singh and Emily Laltoo and C. Boyle and Conor Owens-Walton and Yao‐Liang Chen and P. Cook and C. McMillan and Chih-Chien Tsai and J-J Wang and Yih-Ru Wu and Y. D. van der Werf and P. Thompson},\r\n booktitle = {bioRxiv},\r\n journal = {bioRxiv},\r\n title = {Comparison of Anatomical and Diffusion MRI for detecting Parkinson’s Disease using Deep Convolutional Neural Network},\r\n year = {2023}\r\n}\r\n\",0,bioRxiv,bioRxiv,\r\r\n003f528634735ece46286609f6a074fbfa435722,Pre-trained deep learning models for brain MRI image classification,Srigiri Krishnapriya,2023,\"Brain tumors are serious conditions caused by uncontrolled and abnormal cell division. Tumors can have devastating implications if not accurately and promptly detected. Magnetic resonance imaging (MRI) is one of the methods frequently used to detect brain tumors owing to its excellent resolution. In the past few decades, substantial research has been conducted in the field of classifying brain images, ranging from traditional methods to deep-learning techniques such as convolutional neural networks (CNN). To accomplish classification, machine-learning methods require manually created features. In contrast, CNN achieves classification by extracting visual features from unprocessed images. The size of the training dataset had a significant impact on the features that CNN extracts. The CNN tends to overfit when its size is small. Deep CNNs (DCNN) with transfer learning have therefore been developed. 
The aim of this work was to investigate the brain MR image categorization potential of pre-trained DCNN VGG-19, VGG-16, ResNet50, and Inception V3 models using data augmentation and transfer learning techniques. Validation of the test set utilizing accuracy, recall, Precision, and F1 score showed that the pre-trained VGG-19 model with transfer learning exhibited the best performance. In addition, these methods offer an end-to-end classification of raw images without the need for manual attribute extraction.\",\"Brain tumors are serious conditions caused by uncontrolled and abnormal cell division. Tumors can have devastating implications if not accurately and promptly detected. Magnetic resonance imaging (MRI) is one of the methods frequently used to detect brain tumors owing to its excellent resolution. In the past few decades, substantial research has been conducted in the field of classifying brain images, ranging from traditional methods to deep-learning techniques such as convolutional neural networks (CNN). To accomplish classification, machine-learning methods require manually created features. In contrast, CNN achieves classification by extracting visual features from unprocessed images. The size of the training dataset had a significant impact on the features that CNN extracts. The CNN tends to overfit when its size is small. Deep CNNs (DCNN) with transfer learning have therefore been developed. The aim of this work was to investigate the brain MR image categorization potential of pre-trained DCNN VGG-19, VGG-16, ResNet50, and Inception V3 models using data augmentation and transfer learning techniques. Validation of the test set utilizing accuracy, recall, Precision, and F1 score showed that the pre-trained VGG-19 model with transfer learning exhibited the best performance. 
In addition, these methods offer an end-to-end classification of raw images without the need for manual attribute extraction.\",\"@Article{Krishnapriya2023PretrainedDL,\r\n author = {Srigiri Krishnapriya and Y. Karuna},\r\n booktitle = {Frontiers in Human Neuroscience},\r\n journal = {Frontiers in Human Neuroscience},\r\n title = {Pre-trained deep learning models for brain MRI image classification},\r\n volume = {17},\r\n year = {2023}\r\n}\r\n\",0,Frontiers in Human Neuroscience,Frontiers in Human Neuroscience,\r\r\n"
  }
]