Dr. Richard Zinck committed
Commit b87f798 · Parent: 5788d0e

Basic files

Files changed (8)
  1. LICENSE +360 -0
  2. audio.py +118 -0
  3. bat_gui.py +676 -0
  4. bat_ident.py +616 -0
  5. config.py +257 -0
  6. model.py +389 -0
  7. requirements.txt +12 -0
  8. segments.py +305 -0
LICENSE ADDED
@@ -0,0 +1,360 @@
+ Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
+ Public License
+
+ By exercising the Licensed Rights (defined below), You accept and agree
+ to be bound by the terms and conditions of this Creative Commons
+ Attribution-NonCommercial-ShareAlike 4.0 International Public License
+ ("Public License"). To the extent this Public License may be
+ interpreted as a contract, You are granted the Licensed Rights in
+ consideration of Your acceptance of these terms and conditions, and the
+ Licensor grants You such rights in consideration of benefits the
+ Licensor receives from making the Licensed Material available under
+ these terms and conditions.
+
+
+ Section 1 -- Definitions.
+
+ a. Adapted Material means material subject to Copyright and Similar
+ Rights that is derived from or based upon the Licensed Material
+ and in which the Licensed Material is translated, altered,
+ arranged, transformed, or otherwise modified in a manner requiring
+ permission under the Copyright and Similar Rights held by the
+ Licensor. For purposes of this Public License, where the Licensed
+ Material is a musical work, performance, or sound recording,
+ Adapted Material is always produced where the Licensed Material is
+ synched in timed relation with a moving image.
+
+ b. Adapter's License means the license You apply to Your Copyright
+ and Similar Rights in Your contributions to Adapted Material in
+ accordance with the terms and conditions of this Public License.
+
+ c. BY-NC-SA Compatible License means a license listed at
+ creativecommons.org/compatiblelicenses, approved by Creative
+ Commons as essentially the equivalent of this Public License.
+
+ d. Copyright and Similar Rights means copyright and/or similar rights
+ closely related to copyright including, without limitation,
+ performance, broadcast, sound recording, and Sui Generis Database
+ Rights, without regard to how the rights are labeled or
+ categorized. For purposes of this Public License, the rights
+ specified in Section 2(b)(1)-(2) are not Copyright and Similar
+ Rights.
+
+ e. Effective Technological Measures means those measures that, in the
+ absence of proper authority, may not be circumvented under laws
+ fulfilling obligations under Article 11 of the WIPO Copyright
+ Treaty adopted on December 20, 1996, and/or similar international
+ agreements.
+
+ f. Exceptions and Limitations means fair use, fair dealing, and/or
+ any other exception or limitation to Copyright and Similar Rights
+ that applies to Your use of the Licensed Material.
+
+ g. License Elements means the license attributes listed in the name
+ of a Creative Commons Public License. The License Elements of this
+ Public License are Attribution, NonCommercial, and ShareAlike.
+
+ h. Licensed Material means the artistic or literary work, database,
+ or other material to which the Licensor applied this Public
+ License.
+
+ i. Licensed Rights means the rights granted to You subject to the
+ terms and conditions of this Public License, which are limited to
+ all Copyright and Similar Rights that apply to Your use of the
+ Licensed Material and that the Licensor has authority to license.
+
+ j. Licensor means the individual(s) or entity(ies) granting rights
+ under this Public License.
+
+ k. NonCommercial means not primarily intended for or directed towards
+ commercial advantage or monetary compensation. For purposes of
+ this Public License, the exchange of the Licensed Material for
+ other material subject to Copyright and Similar Rights by digital
+ file-sharing or similar means is NonCommercial provided there is
+ no payment of monetary compensation in connection with the
+ exchange.
+
+ l. Share means to provide material to the public by any means or
+ process that requires permission under the Licensed Rights, such
+ as reproduction, public display, public performance, distribution,
+ dissemination, communication, or importation, and to make material
+ available to the public including in ways that members of the
+ public may access the material from a place and at a time
+ individually chosen by them.
+
+ m. Sui Generis Database Rights means rights other than copyright
+ resulting from Directive 96/9/EC of the European Parliament and of
+ the Council of 11 March 1996 on the legal protection of databases,
+ as amended and/or succeeded, as well as other essentially
+ equivalent rights anywhere in the world.
+
+ n. You means the individual or entity exercising the Licensed Rights
+ under this Public License. Your has a corresponding meaning.
+
+
+ Section 2 -- Scope.
+
+ a. License grant.
+
+ 1. Subject to the terms and conditions of this Public License,
+ the Licensor hereby grants You a worldwide, royalty-free,
+ non-sublicensable, non-exclusive, irrevocable license to
+ exercise the Licensed Rights in the Licensed Material to:
+
+ a. reproduce and Share the Licensed Material, in whole or
+ in part, for NonCommercial purposes only; and
+
+ b. produce, reproduce, and Share Adapted Material for
+ NonCommercial purposes only.
+
+ 2. Exceptions and Limitations. For the avoidance of doubt, where
+ Exceptions and Limitations apply to Your use, this Public
+ License does not apply, and You do not need to comply with
+ its terms and conditions.
+
+ 3. Term. The term of this Public License is specified in Section
+ 6(a).
+
+ 4. Media and formats; technical modifications allowed. The
+ Licensor authorizes You to exercise the Licensed Rights in
+ all media and formats whether now known or hereafter created,
+ and to make technical modifications necessary to do so. The
+ Licensor waives and/or agrees not to assert any right or
+ authority to forbid You from making technical modifications
+ necessary to exercise the Licensed Rights, including
+ technical modifications necessary to circumvent Effective
+ Technological Measures. For purposes of this Public License,
+ simply making modifications authorized by this Section 2(a)
+ (4) never produces Adapted Material.
+
+ 5. Downstream recipients.
+
+ a. Offer from the Licensor -- Licensed Material. Every
+ recipient of the Licensed Material automatically
+ receives an offer from the Licensor to exercise the
+ Licensed Rights under the terms and conditions of this
+ Public License.
+
+ b. Additional offer from the Licensor -- Adapted Material.
+ Every recipient of Adapted Material from You
+ automatically receives an offer from the Licensor to
+ exercise the Licensed Rights in the Adapted Material
+ under the conditions of the Adapter's License You apply.
+
+ c. No downstream restrictions. You may not offer or impose
+ any additional or different terms or conditions on, or
+ apply any Effective Technological Measures to, the
+ Licensed Material if doing so restricts exercise of the
+ Licensed Rights by any recipient of the Licensed
+ Material.
+
+ 6. No endorsement. Nothing in this Public License constitutes or
+ may be construed as permission to assert or imply that You
+ are, or that Your use of the Licensed Material is, connected
+ with, or sponsored, endorsed, or granted official status by,
+ the Licensor or others designated to receive attribution as
+ provided in Section 3(a)(1)(A)(i).
+
+ b. Other rights.
+
+ 1. Moral rights, such as the right of integrity, are not
+ licensed under this Public License, nor are publicity,
+ privacy, and/or other similar personality rights; however, to
+ the extent possible, the Licensor waives and/or agrees not to
+ assert any such rights held by the Licensor to the limited
+ extent necessary to allow You to exercise the Licensed
+ Rights, but not otherwise.
+
+ 2. Patent and trademark rights are not licensed under this
+ Public License.
+
+ 3. To the extent possible, the Licensor waives any right to
+ collect royalties from You for the exercise of the Licensed
+ Rights, whether directly or through a collecting society
+ under any voluntary or waivable statutory or compulsory
+ licensing scheme. In all other cases the Licensor expressly
+ reserves any right to collect such royalties, including when
+ the Licensed Material is used other than for NonCommercial
+ purposes.
+
+
+ Section 3 -- License Conditions.
+
+ Your exercise of the Licensed Rights is expressly made subject to the
+ following conditions.
+
+ a. Attribution.
+
+ 1. If You Share the Licensed Material (including in modified
+ form), You must:
+
+ a. retain the following if it is supplied by the Licensor
+ with the Licensed Material:
+
+ i. identification of the creator(s) of the Licensed
+ Material and any others designated to receive
+ attribution, in any reasonable manner requested by
+ the Licensor (including by pseudonym if
+ designated);
+
+ ii. a copyright notice;
+
+ iii. a notice that refers to this Public License;
+
+ iv. a notice that refers to the disclaimer of
+ warranties;
+
+ v. a URI or hyperlink to the Licensed Material to the
+ extent reasonably practicable;
+
+ b. indicate if You modified the Licensed Material and
+ retain an indication of any previous modifications; and
+
+ c. indicate the Licensed Material is licensed under this
+ Public License, and include the text of, or the URI or
+ hyperlink to, this Public License.
+
+ 2. You may satisfy the conditions in Section 3(a)(1) in any
+ reasonable manner based on the medium, means, and context in
+ which You Share the Licensed Material. For example, it may be
+ reasonable to satisfy the conditions by providing a URI or
+ hyperlink to a resource that includes the required
+ information.
+
+ 3. If requested by the Licensor, You must remove any of the
+ information required by Section 3(a)(1)(A) to the extent
+ reasonably practicable.
+
+ b. ShareAlike.
+
+ In addition to the conditions in Section 3(a), if You Share
+ Adapted Material You produce, the following conditions also apply.
+
+ 1. The Adapter's License You apply must be a Creative Commons
+ license with the same License Elements, this version or
+ later, or a BY-NC-SA Compatible License.
+
+ 2. You must include the text of, or the URI or hyperlink to, the
+ Adapter's License You apply. You may satisfy this condition
+ in any reasonable manner based on the medium, means, and
+ context in which You Share Adapted Material.
+
+ 3. You may not offer or impose any additional or different terms
+ or conditions on, or apply any Effective Technological
+ Measures to, Adapted Material that restrict exercise of the
+ rights granted under the Adapter's License You apply.
+
+
+ Section 4 -- Sui Generis Database Rights.
+
+ Where the Licensed Rights include Sui Generis Database Rights that
+ apply to Your use of the Licensed Material:
+
+ a. for the avoidance of doubt, Section 2(a)(1) grants You the right
+ to extract, reuse, reproduce, and Share all or a substantial
+ portion of the contents of the database for NonCommercial purposes
+ only;
+
+ b. if You include all or a substantial portion of the database
+ contents in a database in which You have Sui Generis Database
+ Rights, then the database in which You have Sui Generis Database
+ Rights (but not its individual contents) is Adapted Material,
+ including for purposes of Section 3(b); and
+
+ c. You must comply with the conditions in Section 3(a) if You Share
+ all or a substantial portion of the contents of the database.
+
+ For the avoidance of doubt, this Section 4 supplements and does not
+ replace Your obligations under this Public License where the Licensed
+ Rights include other Copyright and Similar Rights.
+
+
+ Section 5 -- Disclaimer of Warranties and Limitation of Liability.
+
+ a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
+ EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
+ AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
+ ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
+ IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
+ WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
+ PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
+ ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
+ KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
+ ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.
+
+ b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
+ TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
+ NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
+ INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
+ COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
+ USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
+ ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
+ DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
+ IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.
+
+ c. The disclaimer of warranties and limitation of liability provided
+ above shall be interpreted in a manner that, to the extent
+ possible, most closely approximates an absolute disclaimer and
+ waiver of all liability.
+
+
+ Section 6 -- Term and Termination.
+
+ a. This Public License applies for the term of the Copyright and
+ Similar Rights licensed here. However, if You fail to comply with
+ this Public License, then Your rights under this Public License
+ terminate automatically.
+
+ b. Where Your right to use the Licensed Material has terminated under
+ Section 6(a), it reinstates:
+
+ 1. automatically as of the date the violation is cured, provided
+ it is cured within 30 days of Your discovery of the
+ violation; or
+
+ 2. upon express reinstatement by the Licensor.
+
+ For the avoidance of doubt, this Section 6(b) does not affect any
+ right the Licensor may have to seek remedies for Your violations
+ of this Public License.
+
+ c. For the avoidance of doubt, the Licensor may also offer the
+ Licensed Material under separate terms or conditions or stop
+ distributing the Licensed Material at any time; however, doing so
+ will not terminate this Public License.
+
+ d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
+ License.
+
+
+ Section 7 -- Other Terms and Conditions.
+
+ a. The Licensor shall not be bound by any additional or different
+ terms or conditions communicated by You unless expressly agreed.
+
+ b. Any arrangements, understandings, or agreements regarding the
+ Licensed Material not stated herein are separate from and
+ independent of the terms and conditions of this Public License.
+
+
+ Section 8 -- Interpretation.
+
+ a. For the avoidance of doubt, this Public License does not, and
+ shall not be interpreted to, reduce, limit, restrict, or impose
+ conditions on any use of the Licensed Material that could lawfully
+ be made without permission under this Public License.
+
+ b. To the extent possible, if any provision of this Public License is
+ deemed unenforceable, it shall be automatically reformed to the
+ minimum extent necessary to make it enforceable. If the provision
+ cannot be reformed, it shall be severed from this Public License
+ without affecting the enforceability of the remaining terms and
+ conditions.
+
+ c. No term or condition of this Public License will be waived and no
+ failure to comply consented to unless expressly agreed to by the
+ Licensor.
+
+ d. Nothing in this Public License constitutes or may be interpreted
+ as a limitation upon, or waiver of, any privileges and immunities
+ that apply to the Licensor or You, including from the legal
+ processes of any jurisdiction or authority.
audio.py ADDED
@@ -0,0 +1,118 @@
+ """Module containing audio helper functions."""
+ import numpy as np
+
+ import config as cfg
+
+ RANDOM = np.random.RandomState(cfg.RANDOM_SEED)
+
+
+ def openAudioFile(path: str, sample_rate=cfg.SAMPLE_RATE, offset=0.0, duration=None):
+     """Open an audio file.
+
+     Opens an audio file with librosa and the given settings.
+
+     Args:
+         path: Path to the audio file.
+         sample_rate: The sample rate at which the file should be processed.
+         offset: The starting offset.
+         duration: Maximum duration of the loaded content.
+
+     Returns:
+         The audio time series and the sampling rate.
+     """
+     # Open file with librosa (uses ffmpeg or libav)
+     import librosa
+
+     sig, rate = librosa.load(path, sr=sample_rate, offset=offset, duration=duration, mono=True, res_type="kaiser_fast")
+
+     return sig, rate
+
+
+ def saveSignal(sig, fname: str):
+     """Saves a signal to file.
+
+     Args:
+         sig: The signal to be saved.
+         fname: The file path.
+     """
+     import soundfile as sf
+
+     sf.write(fname, sig, cfg.SAMPLE_RATE, "PCM_16")
+
+
+ def noise(sig, shape, amount=None):
+     """Creates noise.
+
+     Creates a noise vector with the given shape.
+
+     Args:
+         sig: The original audio signal.
+         shape: Shape of the noise.
+         amount: The noise intensity.
+
+     Returns:
+         A numpy array of noise with the given shape.
+     """
+     # Random noise intensity
+     if amount is None:
+         amount = RANDOM.uniform(0.1, 0.5)
+
+     # Create Gaussian noise
+     try:
+         noise = RANDOM.normal(min(sig) * amount, max(sig) * amount, shape)
+     except Exception:
+         noise = np.zeros(shape)
+
+     return noise.astype("float32")
+
+
+ def splitSignal(sig, rate, seconds, overlap, minlen):
+     """Split signal with overlap.
+
+     Args:
+         sig: The original signal to be split.
+         rate: The sampling rate.
+         seconds: The duration of a segment.
+         overlap: The overlapping seconds of segments.
+         minlen: Minimum length of a split.
+
+     Returns:
+         A list of splits.
+     """
+     sig_splits = []
+
+     for i in range(0, len(sig), int((seconds - overlap) * rate)):
+         split = sig[i : i + int(seconds * rate)]
+
+         # End of signal?
+         if len(split) < int(minlen * rate):
+             break
+
+         # Signal chunk too short? Pad with noise up to the full segment length.
+         if len(split) < int(rate * seconds):
+             split = np.hstack((split, noise(split, (int(rate * seconds) - len(split)), 0.5)))
+
+         sig_splits.append(split)
+
+     return sig_splits
+
+
+ def cropCenter(sig, rate, seconds):
+     """Crop signal to center.
+
+     Args:
+         sig: The original signal.
+         rate: The sampling rate.
+         seconds: The target length of the signal.
+
+     Returns:
+         The signal cropped (or noise-padded) to the target length.
+     """
+     if len(sig) > int(seconds * rate):
+         start = int((len(sig) - int(seconds * rate)) / 2)
+         end = start + int(seconds * rate)
+         sig = sig[start:end]
+
+     # Pad with noise
+     elif len(sig) < int(seconds * rate):
+         sig = np.hstack((sig, noise(sig, (int(seconds * rate) - len(sig)), 0.5)))
+
+     return sig
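
Note: a minimal usage sketch for the helpers above (not part of the commit), mirroring how bat_ident.py consumes them. The recording name "example.wav" is hypothetical; the settings come from the defaults in config.py (cfg.SAMPLE_RATE, cfg.SIG_LENGTH, cfg.SIG_OVERLAP, cfg.SIG_MINLEN):

    import audio
    import config as cfg

    # Load the recording resampled to the configured rate (mono).
    sig, rate = audio.openAudioFile("example.wav", cfg.SAMPLE_RATE)

    # Split into fixed-length chunks; a short trailing chunk is padded with noise.
    chunks = audio.splitSignal(sig, rate, cfg.SIG_LENGTH, cfg.SIG_OVERLAP, cfg.SIG_MINLEN)

    # Each chunk now holds exactly cfg.SIG_LENGTH * rate samples, ready for the model.
    print(len(chunks), "chunks of", int(cfg.SIG_LENGTH * rate), "samples")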
bat_gui.py ADDED
@@ -0,0 +1,676 @@
+ import concurrent.futures
+ import logging
+ import os
+ import sys
+ from multiprocessing import freeze_support
+
+ import gradio as gr
+ import librosa
+ import webview
+
+ import bat_ident
+ import config as cfg
+ import segments
+ import utils
+
+ logging.basicConfig(filename="bat_gui.log", encoding="utf-8", level=logging.DEBUG)
+
+ _WINDOW: webview.Window
+
+ _AREA_ONE = "EU"
+ _AREA_TWO = "Bavaria"
+ _AREA_THREE = "USA"
+ _AREA_FOUR = "Scotland"
+ _AREA_FIVE = "UK"
+
+ #
+ # MODEL part mixed with CONTROLLER
+ #
+ OUTPUT_TYPE_MAP = {"Raven selection table": "table", "Audacity": "audacity", "R": "r", "CSV": "csv"}
+ ORIGINAL_MODEL_PATH = cfg.MODEL_PATH
+ ORIGINAL_MDATA_MODEL_PATH = cfg.MDATA_MODEL_PATH
+ ORIGINAL_LABELS_FILE = cfg.LABELS_FILE
+ ORIGINAL_TRANSLATED_LABELS_PATH = cfg.TRANSLATED_BAT_LABELS_PATH  # cfg.TRANSLATED_LABELS_PATH
+
+
+ def analyzeFile_wrapper(entry):
+     return (entry[0], bat_ident.analyze_file(entry))
+
+
+ def validate(value, msg):
+     """Checks that the value is not falsy.
+
+     If the value is falsy, an error is raised.
+
+     Args:
+         value: Value to be tested.
+         msg: Message in case of an error.
+     """
+     if not value:
+         raise gr.Error(msg)
+
+
+ def runBatchAnalysis(
+     output_path,
+     confidence,
+     sensitivity,
+     overlap,
+     species_list_choice,
+     locale,
+     batch_size,
+     threads,
+     input_dir,
+     output_type_radio,
+     progress=gr.Progress(),
+ ):
+     validate(input_dir, "Please select a directory.")
+     batch_size = int(batch_size)
+     threads = int(threads)
+
+     return runAnalysis(
+         species_list_choice,
+         None,
+         output_path,
+         confidence,
+         sensitivity,
+         overlap,
+         output_type_radio,
+         "en" if not locale else locale,
+         batch_size,
+         threads,
+         input_dir,
+         progress,
+     )
+
+
+ def runSingleFileAnalysis(input_path, confidence, sensitivity, overlap, species_list_choice, locale):
+     validate(input_path, "Please select a file.")
+     logging.info("first level")
+     return runAnalysis(
+         species_list_choice,
+         input_path,
+         None,
+         confidence,
+         sensitivity,
+         overlap,
+         "csv",
+         "en" if not locale else locale,
+         1,
+         4,
+         None,
+         progress=None,
+     )
+
+
+ def runAnalysis(
+     species_list_choice: str,
+     input_path: str,
+     output_path: str | None,
+     confidence: float,
+     sensitivity: float,
+     overlap: float,
+     output_type: str,
+     locale: str,
+     batch_size: int,
+     threads: int,
+     input_dir: str,
+     progress: gr.Progress | None,
+ ):
+     """Starts the analysis.
+
+     Args:
+         species_list_choice: The choice for the species list (area).
+         input_path: Either a file or directory.
+         output_path: The output path for the result; if None the input_path is used.
+         confidence: The selected minimum confidence.
+         sensitivity: The selected sensitivity.
+         overlap: The selected segment overlap.
+         output_type: The type of result to be generated.
+         locale: The translation to be used.
+         batch_size: The number of samples in a batch.
+         threads: The number of threads to be used.
+         input_dir: The input directory.
+         progress: The gradio progress bar.
+     """
+     logging.info("second level")
+     if progress is not None:
+         progress(0, desc="Preparing ...")
+
+     cfg.LATITUDE, cfg.LONGITUDE, cfg.WEEK = -1, -1, -1
+     cfg.LOCATION_FILTER_THRESHOLD = 0.03
+     script_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
+     cfg.BAT_CLASSIFIER_LOCATION = os.path.join(script_dir, cfg.BAT_CLASSIFIER_LOCATION)
+
+     # Map each area choice to its classifier base name and locale;
+     # unknown choices fall back to the EU classifier.
+     area_classifiers = {
+         "Bavaria": ("BattyBirdNET-Bavaria-144kHz", "de"),
+         "EU": ("BattyBirdNET-EU-144kHz", "en"),
+         "Scotland": ("BattyBirdNET-Scotland-144kHz", "en"),
+         "UK": ("BattyBirdNET-UK-144kHz", "en"),
+         "USA": ("BattyBirdNET-USA-144kHz", "en"),
+     }
+     base_name, locale = area_classifiers.get(species_list_choice, area_classifiers["EU"])
+     cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/" + base_name + ".tflite"
+     cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/" + base_name + "_Labels.txt"
+     cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
+     cfg.LATITUDE = -1
+     cfg.LONGITUDE = -1
+     cfg.SPECIES_LIST_FILE = None
+     cfg.SPECIES_LIST = []
+
+     # Load translated labels
+     lfile = os.path.join(cfg.TRANSLATED_BAT_LABELS_PATH,
+                          os.path.basename(cfg.LABELS_FILE).replace(".txt", f"_{locale}.txt"))
+     if locale not in ["en"] and os.path.isfile(lfile):
+         cfg.TRANSLATED_LABELS = utils.readLines(lfile)
+     else:
+         cfg.TRANSLATED_LABELS = cfg.LABELS
+
+     if len(cfg.SPECIES_LIST) == 0:
+         print(f"Species list contains {len(cfg.LABELS)} species")
+     else:
+         print(f"Species list contains {len(cfg.SPECIES_LIST)} species")
+
+     cfg.INPUT_PATH = input_path
+
+     if input_dir:
+         cfg.OUTPUT_PATH = output_path if output_path else input_dir
+     else:
+         cfg.OUTPUT_PATH = output_path if output_path else input_path.split(".", 1)[0] + ".csv"
+
+     # Parse input files
+     if input_dir:
+         cfg.FILE_LIST = utils.collect_audio_files(input_dir)
+         cfg.INPUT_PATH = input_dir
+     elif os.path.isdir(cfg.INPUT_PATH):
+         cfg.FILE_LIST = utils.collect_audio_files(cfg.INPUT_PATH)
+     else:
+         cfg.FILE_LIST = [cfg.INPUT_PATH]
+
+     validate(cfg.FILE_LIST, "No audio files found.")
+     cfg.MIN_CONFIDENCE = confidence
+     cfg.SIGMOID_SENSITIVITY = sensitivity
+     cfg.SIG_OVERLAP = overlap
+
+     # Set result type
+     cfg.RESULT_TYPE = OUTPUT_TYPE_MAP[output_type] if output_type in OUTPUT_TYPE_MAP else output_type.lower()
+
+     if cfg.RESULT_TYPE not in ["table", "audacity", "r", "csv"]:
+         cfg.RESULT_TYPE = "table"
+
+     # Set number of threads
+     if input_dir:
+         cfg.CPU_THREADS = max(1, int(threads))
+         cfg.TFLITE_THREADS = 1
+     else:
+         cfg.CPU_THREADS = 1
+         cfg.TFLITE_THREADS = max(1, int(threads))
+
+     # Set batch size
+     cfg.BATCH_SIZE = max(1, int(batch_size))
+
+     # Pair each file with a config snapshot so worker processes can restore it.
+     flist = [(f, cfg.get_config()) for f in cfg.FILE_LIST]
+
+     result_list = []
+
+     if progress is not None:
+         progress(0, desc="Starting ...")
+
+     # Analyze files
+     if cfg.CPU_THREADS < 2:
+         for entry in flist:
+             result = analyzeFile_wrapper(entry)
+             result_list.append(result)
+     else:
+         with concurrent.futures.ProcessPoolExecutor(max_workers=cfg.CPU_THREADS) as executor:
+             futures = (executor.submit(analyzeFile_wrapper, arg) for arg in flist)
+
+             for i, f in enumerate(concurrent.futures.as_completed(futures), start=1):
+                 if progress is not None:
+                     progress((i, len(flist)), total=len(flist), unit="files")
+                 result = f.result()
+                 result_list.append(result)
+
+     return [[os.path.relpath(r[0], input_dir), r[1]] for r in result_list] if input_dir else cfg.OUTPUT_PATH
+
+
+ def extractSegments_wrapper(entry):
+     return (entry[0][0], segments.extractSegments(entry))
+
+
+ def extract_segments(audio_dir, result_dir, output_dir, min_conf, num_seq, seq_length, threads, progress=gr.Progress()):
+     validate(audio_dir, "No audio directory selected")
+
+     if not result_dir:
+         result_dir = audio_dir
+
+     if not output_dir:
+         output_dir = audio_dir
+
+     if progress is not None:
+         progress(0, desc="Searching files ...")
+
+     # Parse audio and result folders
+     cfg.FILE_LIST = segments.parseFolders(audio_dir, result_dir)
+
+     # Set output folder
+     cfg.OUTPUT_PATH = output_dir
+
+     # Set number of threads
+     cfg.CPU_THREADS = int(threads)
+
+     # Set confidence threshold
+     cfg.MIN_CONFIDENCE = max(0.01, min(0.99, min_conf))
+
+     # Parse file list and make list of segments
+     cfg.FILE_LIST = segments.parseFiles(cfg.FILE_LIST, max(1, int(num_seq)))
+
+     # Add config items to each file list entry. We have to do this for
+     # Windows, which does not support fork() and thus each process has to
+     # have its own config. USE LINUX!
+     flist = [(entry, max(cfg.SIG_LENGTH, float(seq_length)), cfg.get_config()) for entry in cfg.FILE_LIST]
+
+     result_list = []
+
+     # Extract segments
+     if cfg.CPU_THREADS < 2:
+         for i, entry in enumerate(flist):
+             result = extractSegments_wrapper(entry)
+             result_list.append(result)
+
+             if progress is not None:
+                 progress((i, len(flist)), total=len(flist), unit="files")
+     else:
+         with concurrent.futures.ProcessPoolExecutor(max_workers=cfg.CPU_THREADS) as executor:
+             futures = (executor.submit(extractSegments_wrapper, arg) for arg in flist)
+             for i, f in enumerate(concurrent.futures.as_completed(futures), start=1):
+                 if progress is not None:
+                     progress((i, len(flist)), total=len(flist), unit="files")
+                 result = f.result()
+
+                 result_list.append(result)
+
+     return [[os.path.relpath(r[0], audio_dir), r[1]] for r in result_list]
+
+
+ def select_file(filetypes=()):
+     """Creates a file selection dialog.
+
+     Args:
+         filetypes: List of filetypes to be filtered in the dialog.
+
+     Returns:
+         The selected file, or None if the dialog was canceled.
+     """
+     files = _WINDOW.create_file_dialog(webview.OPEN_DIALOG, file_types=filetypes)
+     return files[0] if files else None
+
+
+ def format_seconds(secs: float):
+     """Formats a number of seconds into a string.
+
+     Formats the seconds into the format "h:mm:ss.ms".
+
+     Args:
+         secs: Number of seconds.
+
+     Returns:
+         A string with the formatted seconds.
+     """
+     hours, secs = divmod(secs, 3600)
+     minutes, secs = divmod(secs, 60)
+
+     return "{:2.0f}:{:02.0f}:{:06.3f}".format(hours, minutes, secs)
+
+
+ def select_directory(collect_files=True):
+     """Shows a directory selection system dialog.
+
+     Uses pywebview to create a system dialog.
+
+     Args:
+         collect_files: If True, also lists all files inside the directory.
+
+     Returns:
+         If collect_files == True, returns (directory path, list of (relative file path, audio length)),
+         else just the directory path.
+         All values will be None if the dialog is cancelled.
+     """
+     dir_name = _WINDOW.create_file_dialog(webview.FOLDER_DIALOG)
+
+     if collect_files:
+         if not dir_name:
+             return None, None
+
+         files = utils.collect_audio_files(dir_name[0])
+
+         return dir_name[0], [
+             [os.path.relpath(file, dir_name[0]), format_seconds(librosa.get_duration(filename=file))] for file in files
+         ]
+
+     return dir_name[0] if dir_name else None
+
+
+ def show_species_choice(choice: str):
+     """Sets the visibility of the species list choices.
+
+     Args:
+         choice: The label of the currently active choice.
+
+     Returns:
+         A list of [Row update, File update, Column update, Column update].
+     """
+     return [
+         gr.Row.update(visible=True),
+         gr.File.update(visible=False),
+         gr.Column.update(visible=False),
+         gr.Column.update(visible=False),
+     ]
+
+
+ #
+ # VIEW - This is where the UI elements are defined
+ #
+
+ def sample_sliders(opened=True):
+     """Creates the gradio accordion for the inference settings.
+
+     Args:
+         opened: If True the accordion is open on init.
+
+     Returns:
+         A tuple with the created elements:
+         (Slider (min confidence), Slider (sensitivity), Slider (overlap))
+     """
+     with gr.Accordion("Inference settings", open=opened):
+         with gr.Row():
+             confidence_slider = gr.Slider(
+                 minimum=0, maximum=1, value=0.5, step=0.01, label="Minimum Confidence", info="Minimum confidence threshold."
+             )
+             sensitivity_slider = gr.Slider(
+                 minimum=0.5,
+                 maximum=1.5,
+                 value=1,
+                 step=0.01,
+                 label="Sensitivity",
+                 info="Detection sensitivity; higher values result in higher sensitivity.",
+             )
+             overlap_slider = gr.Slider(
+                 minimum=0, maximum=2.99, value=0, step=0.01, label="Overlap", info="Overlap of prediction segments."
+             )
+
+     return confidence_slider, sensitivity_slider, overlap_slider
+
+
+ def locale():
+     """Creates the gradio elements for locale selection.
+
+     Reads the translated labels inside the checkpoints directory.
+
+     Returns:
+         The dropdown element.
+     """
+     label_files = os.listdir(os.path.join(os.path.dirname(sys.argv[0]), ORIGINAL_TRANSLATED_LABELS_PATH))
+     options = ["EN"] + [label_file.rsplit("_", 1)[-1].split(".")[0].upper() for label_file in label_files]
+
+     return gr.Dropdown(options, value="EN", label="Locale", info="Locale for the translated species common names.", visible=False)
+
+
+ def species_lists(opened=True):
+     """Creates the gradio accordion for area selection.
+
+     Args:
+         opened: If True the accordion is open on init.
+
+     Returns:
+         The radio element with the region choices.
+     """
+     with gr.Accordion("Area selection", open=opened):
+         with gr.Row():
+             species_list_radio = gr.Radio(
+                 [_AREA_ONE, _AREA_TWO, _AREA_THREE, _AREA_FOUR, _AREA_FIVE],
+                 value="All regions",
+                 label="Regions list",
+                 info="List of all possible regions",
+                 elem_classes="d-block",
+             )
+
+     return species_list_radio
+
+
+ #
+ # Design main frame for analysis of a single file
+ #
+ def build_single_analysis_tab():
+     with gr.Tab("Single file"):
+         audio_input = gr.Audio(type="filepath", label="file", elem_id="single_file_audio")
+         confidence_slider, sensitivity_slider, overlap_slider = sample_sliders(False)
+         species_list_radio = species_lists(False)
+         locale_radio = locale()
+
+         inputs = [
+             audio_input,
+             confidence_slider,
+             sensitivity_slider,
+             overlap_slider,
+             species_list_radio,
+             locale_radio,
+         ]
+
+         output_dataframe = gr.Dataframe(
+             type="pandas",
+             headers=["Start (s)", "End (s)", "Scientific name", "Common name", "Confidence"],
+             elem_classes="mh-200",
+         )
+         single_file_analyze = gr.Button("Analyze")
+         single_file_analyze.click(
+             runSingleFileAnalysis,
+             inputs=inputs,
+             outputs=output_dataframe,
+         )
+
+
+ def build_multi_analysis_tab():
+     with gr.Tab("Multiple files"):
+         input_directory_state = gr.State()
+         output_directory_predict_state = gr.State()
+         with gr.Row():
+             with gr.Column():
+                 select_directory_btn = gr.Button("Select directory (recursive)")
+                 directory_input = gr.Matrix(interactive=False, elem_classes="mh-200", headers=["Subpath", "Length"])
+
+                 def select_directory_on_empty():
+                     res = select_directory()
+
+                     return res if res[1] else [res[0], [["No files found"]]]
+
+                 select_directory_btn.click(
+                     select_directory_on_empty, outputs=[input_directory_state, directory_input], show_progress=True
+                 )
+
+             with gr.Column():
+                 select_out_directory_btn = gr.Button("Select output directory.")
+                 selected_out_textbox = gr.Textbox(
+                     label="Output directory",
+                     interactive=False,
+                     placeholder="If not selected, the input directory will be used.",
+                 )
+
+                 def select_directory_wrapper():
+                     return (select_directory(collect_files=False),) * 2
+
+                 select_out_directory_btn.click(
+                     select_directory_wrapper,
+                     outputs=[output_directory_predict_state, selected_out_textbox],
+                     show_progress=False,
+                 )
+
+         confidence_slider, sensitivity_slider, overlap_slider = sample_sliders()
+         species_list_radio = species_lists(False)
+
+         output_type_radio = gr.Radio(
+             list(OUTPUT_TYPE_MAP.keys()),
+             value="Raven selection table",
+             label="Result type",
+             info="Specifies output format.",
+         )
+
+         with gr.Row():
+             batch_size_number = gr.Number(
+                 precision=1, label="Batch size", value=1, info="Number of samples to process at the same time."
+             )
+             threads_number = gr.Number(precision=1, label="Threads", value=4, info="Number of CPU threads.")
+
+         locale_radio = locale()
+
+         start_batch_analysis_btn = gr.Button("Analyze")
+
+         result_grid = gr.Matrix(headers=["File", "Execution"], elem_classes="mh-200")
+
+         inputs = [
+             output_directory_predict_state,
+             confidence_slider,
+             sensitivity_slider,
+             overlap_slider,
+             species_list_radio,
+             locale_radio,
+             batch_size_number,
+             threads_number,
+             input_directory_state,
+             output_type_radio,
+         ]
+
+         start_batch_analysis_btn.click(runBatchAnalysis, inputs=inputs, outputs=result_grid)
+
+
+ def build_segments_tab():
+     with gr.Tab("Segments"):
+         audio_directory_state = gr.State()
+         result_directory_state = gr.State()
+         output_directory_state = gr.State()
+
+         def select_directory_to_state_and_tb():
+             return (select_directory(collect_files=False),) * 2
+
+         with gr.Row():
+             select_audio_directory_btn = gr.Button("Select audio directory (recursive)")
+             selected_audio_directory_tb = gr.Textbox(show_label=False, interactive=False)
+             select_audio_directory_btn.click(
+                 select_directory_to_state_and_tb,
+                 outputs=[selected_audio_directory_tb, audio_directory_state],
+                 show_progress=False,
+             )
+
+         with gr.Row():
+             select_result_directory_btn = gr.Button("Select result directory")
+             selected_result_directory_tb = gr.Textbox(
+                 show_label=False, interactive=False, placeholder="Same as audio directory if not selected"
+             )
+             select_result_directory_btn.click(
+                 select_directory_to_state_and_tb,
+                 outputs=[result_directory_state, selected_result_directory_tb],
+                 show_progress=False,
+             )
+
+         with gr.Row():
+             select_output_directory_btn = gr.Button("Select output directory")
+             selected_output_directory_tb = gr.Textbox(
+                 show_label=False, interactive=False, placeholder="Same as audio directory if not selected"
+             )
+             select_output_directory_btn.click(
+                 select_directory_to_state_and_tb,
+                 outputs=[selected_output_directory_tb, output_directory_state],
+                 show_progress=False,
+             )
+
+         min_conf_slider = gr.Slider(
+             minimum=0.1, maximum=0.99, step=0.01, label="Minimum confidence", info="Minimum confidence threshold."
+         )
+         num_seq_number = gr.Number(
+             100, label="Max number of segments", info="Maximum number of randomly extracted segments per species."
+         )
+         seq_length_number = gr.Number(3.0, label="Sequence length", info="Length of extracted segments in seconds.")
+         threads_number = gr.Number(4, label="Threads", info="Number of CPU threads.")
+
+         extract_segments_btn = gr.Button("Extract segments")
+
+         result_grid = gr.Matrix(headers=["File", "Execution"], elem_classes="mh-200")
+
+         extract_segments_btn.click(
+             extract_segments,
+             inputs=[
+                 audio_directory_state,
+                 result_directory_state,
+                 output_directory_state,
+                 min_conf_slider,
+                 num_seq_number,
+                 seq_length_number,
+                 threads_number,
+             ],
+             outputs=result_grid,
+         )
+
+
+ if __name__ == "__main__":
+     freeze_support()
+     with gr.Blocks(
+         css=r".d-block .wrap {display: block !important;} .mh-200 {max-height: 300px; overflow-y: auto !important;} footer {display: none !important;} #single_file_audio, #single_file_audio * {max-height: 81.6px; min-height: 0;}",
+         theme=gr.themes.Default(),
+         analytics_enabled=False,
+     ) as demo:
+         build_single_analysis_tab()
+         build_multi_analysis_tab()
+         build_segments_tab()
+
+     url = demo.queue(api_open=False).launch(prevent_thread_lock=True, quiet=True)[1]
+     # _WINDOW = webview.create_window("BattyBirdNET-Analyzer", url.rstrip("/") +
+     #                                 "?__theme=light", min_size=(1024, 768))
+     # webview.start(private_mode=False)
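
Note: the parallel analysis above passes a config snapshot along with every file, because spawned worker processes (e.g. on Windows, where fork() is unavailable) do not inherit the module-level settings. A minimal sketch of that pattern (not part of the commit), assuming cfg.get_config()/cfg.set_config() round-trip the settings as they do in this codebase; the worker body and file names are hypothetical:

    import concurrent.futures
    import config as cfg

    def worker(entry):
        fpath, conf = entry
        cfg.set_config(conf)                # restore the settings inside the spawned process
        return fpath, cfg.MIN_CONFIDENCE    # ...run the real per-file analysis here

    if __name__ == "__main__":
        files = ["a.wav", "b.wav"]          # hypothetical inputs
        jobs = [(f, cfg.get_config()) for f in files]
        with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
            for fpath, result in executor.map(worker, jobs):
                print(fpath, result)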
bat_ident.py ADDED
@@ -0,0 +1,616 @@
1
+ """Module to analyze audio samples.
2
+ """
3
+ import argparse
4
+ import datetime
5
+ import json
6
+ import operator
7
+ import os
8
+ import sys
9
+ from multiprocessing import Pool, freeze_support
10
+ import numpy as np
11
+ import audio
12
+ import config as cfg
13
+ import model
14
+ import species
15
+ import utils
16
+ import subprocess
17
+ import pathlib
18
+
19
+
20
+ def load_codes():
21
+ """Loads the eBird codes.
22
+ Returns:
23
+ A dictionary containing the eBird codes.
24
+ """
25
+ with open(cfg.CODES_FILE, "r") as cfile:
26
+ codes = json.load(cfile)
27
+ return codes
28
+
29
+ def save_result_file(r: dict[str, list], path: str, afile_path: str):
30
+ """Saves the results to the hard drive.
31
+ Args:
32
+ r: The dictionary with {segment: scores}.
33
+ path: The path where the result should be saved.
34
+ afile_path: The path to audio file.
35
+ """
36
+ # Make folder if it doesn't exist
37
+ if os.path.dirname(path):
38
+ os.makedirs(os.path.dirname(path), exist_ok=True)
39
+
40
+ # Selection table
41
+ out_string = ""
42
+
43
+ if cfg.RESULT_TYPE == "table":
44
+ # Raven selection header
45
+ header = "Selection\tView\tChannel\tBegin Time (s)\tEnd Time (s)\tSpecies Code\tCommon Name\tConfidence\n"
46
+ selection_id = 0
47
+ # Write header
48
+ out_string += header
49
+
50
+ # Extract valid predictions for every timestamp
51
+ for timestamp in get_sorted_timestamps(r):
52
+ rstring = ""
53
+ start, end = timestamp.split("-", 1)
54
+
55
+ for c in r[timestamp]:
56
+ if c[1] > cfg.MIN_CONFIDENCE and (not cfg.SPECIES_LIST or c[0] in cfg.SPECIES_LIST):
57
+ selection_id += 1
58
+ label = cfg.TRANSLATED_LABELS[cfg.LABELS.index(c[0])]
59
+ rstring += "{}\tSpectrogram 1\t1\t{}\t{}\t{}\t{}\t{:.4f}\n".format(
60
+ selection_id,
61
+ start,
62
+ end,
63
+ cfg.CODES[c[0]] if c[0] in cfg.CODES else c[0],
64
+ label.split("_", 1)[-1],
65
+ c[1],
66
+ )
67
+
68
+ # Write result string to file
69
+ out_string += rstring
70
+
71
+ elif cfg.RESULT_TYPE == "audacity":
72
+ # Audacity timeline labels
73
+ for timestamp in get_sorted_timestamps(r):
74
+ rstring = ""
75
+
76
+ for c in r[timestamp]:
77
+ if c[1] > cfg.MIN_CONFIDENCE and (not cfg.SPECIES_LIST or c[0] in cfg.SPECIES_LIST):
78
+ label = cfg.TRANSLATED_LABELS[cfg.LABELS.index(c[0])]
79
+ rstring += "{}\t{}\t{:.4f}\n".format(timestamp.replace("-", "\t"), label.replace("_", ", "), c[1])
80
+
81
+ # Write result string to file
82
+ out_string += rstring
83
+
84
+ elif cfg.RESULT_TYPE == "r":
85
+ # Output format for R
86
+ header = ("filepath,start,end,scientific_name,common_name,confidence,lat,lon,week,"
87
+ "overlap,sensitivity,min_conf,species_list,model")
88
+ out_string += header
89
+
90
+ for timestamp in get_sorted_timestamps(r):
91
+ rstring = ""
92
+ start, end = timestamp.split("-", 1)
93
+
94
+ for c in r[timestamp]:
95
+ if c[1] > cfg.MIN_CONFIDENCE and (not cfg.SPECIES_LIST or c[0] in cfg.SPECIES_LIST):
96
+ label = cfg.TRANSLATED_LABELS[cfg.LABELS.index(c[0])]
97
+ rstring += "\n{},{},{},{},{},{:.4f},{:.4f},{:.4f},{},{},{},{},{},{}".format(
98
+ afile_path,
99
+ start,
100
+ end,
101
+ label.split("_", 1)[0],
102
+ label.split("_", 1)[-1],
103
+ c[1],
104
+ cfg.LATITUDE,
105
+ cfg.LONGITUDE,
106
+ cfg.WEEK,
107
+ cfg.SIG_OVERLAP,
108
+ (1.0 - cfg.SIGMOID_SENSITIVITY) + 1.0,
109
+ cfg.MIN_CONFIDENCE,
110
+ cfg.SPECIES_LIST_FILE,
111
+ os.path.basename(cfg.MODEL_PATH),
112
+ )
113
+
114
+ # Write result string to file
115
+ out_string += rstring
116
+
117
+ elif cfg.RESULT_TYPE == "kaleidoscope":
118
+ # Output format for kaleidoscope
119
+ header = ("INDIR,FOLDER,IN FILE,OFFSET,DURATION,scientific_name,"
120
+ "common_name,confidence,lat,lon,week,overlap,sensitivity")
121
+ out_string += header
122
+
123
+ folder_path, filename = os.path.split(afile_path)
124
+ parent_folder, folder_name = os.path.split(folder_path)
125
+
126
+ for timestamp in get_sorted_timestamps(r):
127
+ rstring = ""
128
+ start, end = timestamp.split("-", 1)
129
+
130
+ for c in r[timestamp]:
131
+ if c[1] > cfg.MIN_CONFIDENCE and (not cfg.SPECIES_LIST or c[0] in cfg.SPECIES_LIST):
132
+ label = cfg.TRANSLATED_LABELS[cfg.LABELS.index(c[0])]
133
+ rstring += "\n{},{},{},{},{},{},{},{:.4f},{:.4f},{:.4f},{},{},{}".format(
134
+ parent_folder.rstrip("/"),
135
+ folder_name,
136
+ filename,
137
+ start,
138
+ float(end) - float(start),
139
+ label.split("_", 1)[0],
140
+ label.split("_", 1)[-1],
141
+ c[1],
142
+ cfg.LATITUDE,
143
+ cfg.LONGITUDE,
144
+ cfg.WEEK,
145
+ cfg.SIG_OVERLAP,
146
+ (1.0 - cfg.SIGMOID_SENSITIVITY) + 1.0,
147
+ )
148
+
149
+ # Write result string to file
150
+ out_string += rstring
151
+
152
+ else:
153
+ # CSV output file
154
+ header = "Start (s),End (s),Scientific name,Common name,Confidence\n"
155
+
156
+ # Write header
157
+ out_string += header
158
+
159
+ for timestamp in get_sorted_timestamps(r):
160
+ rstring = ""
161
+
162
+ for c in r[timestamp]:
163
+ start, end = timestamp.split("-", 1)
164
+
165
+ if c[1] > cfg.MIN_CONFIDENCE and (not cfg.SPECIES_LIST or c[0] in cfg.SPECIES_LIST):
166
+ label = cfg.TRANSLATED_LABELS[cfg.LABELS.index(c[0])]
167
+ rstring += "{},{},{},{},{:.4f}\n".format(start, end, label.split("_", 1)[0],
168
+ label.split("_", 1)[-1], c[1])
169
+
170
+ # Write result string to file
171
+ out_string += rstring
172
+
173
+ # Save as file
174
+ with open(path, "w", encoding="utf-8") as rfile:
175
+ rfile.write(out_string)
176
+ return out_string
177
+
178
+
179
+ def get_sorted_timestamps(results: dict[str, list]):
180
+ """Sorts the results based on the segments.
181
+ Args:
182
+ results: The dictionary with {segment: scores}.
183
+ Returns:
184
+ Returns the sorted list of segments and their scores.
185
+ """
186
+ return sorted(results, key=lambda t: float(t.split("-", 1)[0]))
187
+
188
+
189
+ def get_raw_audio_from_file(fpath: str):
190
+ """Reads an audio file.
191
+ Reads the file and splits the signal into chunks.
192
+ Args:
193
+ fpath: Path to the audio file.
194
+ Returns:
195
+ The signal split into a list of chunks.
196
+ """
197
+ # Open file
198
+ sig, rate = audio.openAudioFile(fpath, cfg.SAMPLE_RATE)
199
+
200
+ # Split into raw audio chunks
201
+ chunks = audio.splitSignal(sig, rate, cfg.SIG_LENGTH, cfg.SIG_OVERLAP, cfg.SIG_MINLEN)
202
+
203
+ return chunks
204
+
205
+
206
+ def predict(samples):
207
+ """Predicts the classes for the given samples.
208
+
209
+ Args:
210
+ samples: Samples to be predicted.
211
+
212
+ Returns:
213
+ The prediction scores.
214
+ """
215
+ # Prepare sample and pass through model
216
+ data = np.array(samples, dtype="float32")
217
+ prediction = model.predict(data)
218
+
219
+ # Logits or sigmoid activations?
220
+ if cfg.APPLY_SIGMOID:
221
+ prediction = model.flat_sigmoid(np.array(prediction), sensitivity=-cfg.SIGMOID_SENSITIVITY)
222
+
223
+ return prediction
224
+
225
+
226
+ def analyze_file(item):
227
+ """Analyzes a file.
228
+
229
+ Predicts the scores for the file and saves the results.
230
+
231
+ Args:
232
+ item: Tuple containing (file path, config)
233
+
234
+ Returns:
235
+ The `True` if the file was analyzed successfully.
236
+ """
237
+ # Get file path and restore cfg
238
+ fpath: str = item[0]
239
+ cfg.set_config(item[1])
240
+
241
+ # Start time
242
+ start_time = datetime.datetime.now()
243
+
244
+ # Status
245
+ print(f"Analyzing {fpath}", flush=True)
246
+
247
+ try:
248
+ # Open audio file and split into 3-second chunks
249
+ chunks = get_raw_audio_from_file(fpath)
250
+
251
+ # If no chunks, show error and skip
252
+ except Exception as ex:
253
+ print(f"Error: Cannot open audio file {fpath}", flush=True)
254
+ utils.writeErrorLog(ex)
255
+
256
+ return False
257
+
258
+ # Process each chunk
259
+ try:
260
+ start, end = 0, cfg.SIG_LENGTH
261
+ results = {}
262
+ samples = []
263
+ timestamps = []
264
+
265
+ for chunk_index, chunk in enumerate(chunks):
266
+ # Add to batch
267
+ samples.append(chunk)
268
+ timestamps.append([start, end])
269
+
270
+ # Advance start and end
271
+ start += cfg.SIG_LENGTH - cfg.SIG_OVERLAP
272
+ end = start + cfg.SIG_LENGTH
273
+
274
+ # Check if batch is full or last chunk
275
+ if len(samples) < cfg.BATCH_SIZE and chunk_index < len(chunks) - 1:
276
+ continue
277
+
278
+ # Predict
279
+ prediction = predict(samples)
280
+
281
+ # Add to results
282
+ for i in range(len(samples)):
283
+ # Get timestamp
284
+ s_start, s_end = timestamps[i]
285
+
286
+ # Get prediction
287
+ pred = prediction[i]
288
+
289
+ # Assign scores to labels
290
+ p_labels = zip(cfg.LABELS, pred)
291
+
292
+ # Sort by score
293
+ p_sorted = sorted(p_labels, key=operator.itemgetter(1), reverse=True)
294
+
295
+ # Store top 5 results and advance indices
296
+ results[str(s_start) + "-" + str(s_end)] = p_sorted
297
+
298
+ # Clear batch
299
+ samples = []
300
+ timestamps = []
301
+
302
+ except Exception as ex:
303
+ # Write error log
304
+ print(f"Error: Cannot analyze audio file {fpath}.\n", flush=True)
305
+ utils.writeErrorLog(ex)
306
+ return False
307
+
308
+ # Save as selection table
309
+ try:
310
+ # We have to check if output path is a file or directory
311
+ if not cfg.OUTPUT_PATH.rsplit(".", 1)[-1].lower() in ["txt", "csv"]:
312
+ rpath = fpath.replace(cfg.INPUT_PATH, "")
313
+ rpath = rpath[1:] if rpath[0] in ["/", "\\"] else rpath
314
+
315
+ # Make target directory if it doesn't exist
316
+ rdir = os.path.join(cfg.OUTPUT_PATH, os.path.dirname(rpath))
317
+
318
+ os.makedirs(rdir, exist_ok=True)
319
+
320
+ if cfg.RESULT_TYPE == "table":
321
+ rtype = "bat.selection.table.txt"
322
+ elif cfg.RESULT_TYPE == "audacity":
323
+ rtype = ".bat.results.txt"
324
+ else:
325
+ rtype = ".bat.results.csv"
326
+
327
+ out_string = save_result_file(results, os.path.join(cfg.OUTPUT_PATH, rpath.rsplit(".", 1)[0] + rtype), fpath)
328
+ else:
329
+ out_string = save_result_file(results, cfg.OUTPUT_PATH, fpath)
330
+ # Save as file
331
+ with open(cfg.OUTPUT_PATH + "Results.csv", "a", encoding="utf-8") as rfile:
332
+ post_string = out_string.split("\n", 1)[1]
334
+ rfile.write("\n"+fpath+"\n")
335
+ rfile.write(post_string)
336
+
337
+ except Exception as ex:
338
+ # Write error log
339
+ print(f"Error: Cannot save result for {fpath}.\n", flush=True)
340
+ utils.writeErrorLog(ex)
341
+ return False
342
+
343
+ delta_time = (datetime.datetime.now() - start_time).total_seconds()
344
+ print("Finished {} in {:.2f} seconds".format(fpath, delta_time), flush=True)
345
+ return True
346
+
347
+ def set_analysis_location():
348
+ if args.area not in ["Bavaria", "Sweden", "EU", "Scotland", "UK", "USA", "MarinCounty"]:
349
+ sys.exit("Unknown location option.")
350
+ else:
351
+ args.lat = -1
352
+ args.lon = -1
353
+ # args.locale = "en"
354
+
355
+ if args.area == "Bavaria":
356
+ cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-Bavaria-144kHz.tflite"
357
+ cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-Bavaria-144kHz_Labels.txt"
358
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
359
+ args.locale = "de"
360
+
361
+ elif args.area == "EU":
362
+ cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-EU-144kHz.tflite"
363
+ cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-EU-144kHz_Labels.txt"
364
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
365
+
366
+ elif args.area == "Sweden":
367
+ cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-Sweden-144kHz.tflite"
368
+ cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-Sweden-144kHz_Labels.txt"
369
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
370
+ args.locale = "se"
371
+
372
+ elif args.area == "Scotland":
373
+ cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-Scotland-144kHz.tflite"
374
+ cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-Scotland-144kHz_Labels.txt"
375
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
376
+
377
+ elif args.area == "UK":
378
+ cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-UK-144kHz.tflite"
379
+ cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-UK-144kHz_Labels.txt"
380
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
381
+
382
+ elif args.area == "USA":
383
+ cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-USA-144kHz.tflite"
384
+ cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-USA-144kHz_Labels.txt"
385
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
386
+
387
+ elif args.area == "MarinCounty":
388
+ cfg.CUSTOM_CLASSIFIER = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-MarinCounty-144kHz.tflite"
389
+ cfg.LABELS_FILE = cfg.BAT_CLASSIFIER_LOCATION + "/BattyBirdNET-MarinCounty-144kHz_Labels.txt"
390
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
391
+
392
+ else:
393
+ cfg.CUSTOM_CLASSIFIER = None
394
+
395
+ def set_paths():
396
+ # Set paths relative to script path (requested in #3)
397
+ script_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
398
+ cfg.MODEL_PATH = os.path.join(script_dir, cfg.MODEL_PATH)
399
+ cfg.LABELS_FILE = os.path.join(script_dir, cfg.LABELS_FILE)
400
+ cfg.TRANSLATED_LABELS_PATH = os.path.join(script_dir, cfg.TRANSLATED_LABELS_PATH)
401
+ cfg.MDATA_MODEL_PATH = os.path.join(script_dir, cfg.MDATA_MODEL_PATH)
402
+ cfg.CODES_FILE = os.path.join(script_dir, cfg.CODES_FILE)
403
+ cfg.ERROR_LOG_FILE = os.path.join(script_dir, cfg.ERROR_LOG_FILE)
404
+ cfg.BAT_CLASSIFIER_LOCATION = os.path.join(script_dir, cfg.BAT_CLASSIFIER_LOCATION)
405
+ cfg.INPUT_PATH = args.i
406
+ cfg.OUTPUT_PATH = args.o
407
+
408
+ def set_custom_classifier():
409
+ if args.classifier is None:
410
+ return
411
+ cfg.CUSTOM_CLASSIFIER = args.classifier # we treat this as absolute path, so no need to join with dirname
412
+ cfg.LABELS_FILE = args.classifier.replace(".tflite", "_Labels.txt") # same for labels file
413
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
414
+ args.lat = -1
415
+ args.lon = -1
416
+ # args.locale = "en"
417
+
418
+ def add_parser_arguments():
419
+ parser.add_argument("--area",
420
+ default="EU",
421
+ help="Location. Values in ['Bavaria', 'EU', 'Sweden','Scotland', 'UK', 'USA', 'MarinCounty']. "
422
+ "Defaults to Bavaria.")
423
+
424
+ parser.add_argument("--sensitivity",
425
+ type=float,
426
+ default=1.0,
427
+ help="Detection sensitivity; Higher values result in higher sensitivity. "
428
+ "Values in [0.5, 1.5]. Defaults to 1.0."
429
+ )
430
+ parser.add_argument("--min_conf",
431
+ type=float,
432
+ default=0.7,
433
+ help="Minimum confidence threshold. Values in [0.01, 0.99]. Defaults to 0.1.")
434
+
435
+ parser.add_argument("--overlap",
436
+ type=float,
437
+ default=0.0,
438
+ help="Overlap of prediction segments. Values in [0.0, 2.9]. Defaults to 0.0."
439
+ )
440
+ parser.add_argument("--rtype",
441
+ default="csv",
442
+ help="Specifies output format. Values in ['table', 'audacity', 'r', 'kaleidoscope', 'csv']. "
443
+ "Defaults to 'csv' (Raven selection table)."
444
+ )
445
+ parser.add_argument("--threads",
446
+ type=int,
447
+ default=4,
448
+ help="Number of CPU threads.")
449
+ parser.add_argument("--batchsize",
450
+ type=int,
451
+ default=1,
452
+ help="Number of samples to process at the same time. Defaults to 1."
453
+ )
454
+ parser.add_argument("--sf_thresh",
455
+ type=float,
456
+ default=0.03,
457
+ help="Minimum species occurrence frequency threshold for location filter. "
458
+ "Values in [0.01, 0.99]. Defaults to 0.03."
459
+ )
460
+ parser.add_argument("--segment",
461
+ default="off",
462
+ help="Generate audio files containing the detected segments. "
463
+ )
464
+ parser.add_argument("--spectrum",
465
+ default="off",
466
+ help="Generate mel spectrograms files containing the detected segments. "
467
+ )
468
+ parser.add_argument("--i",
469
+ default=cfg.INPUT_PATH_SAMPLES, # "put-your-files-here/",
470
+ help="Path to input file or folder. If this is a file, --o needs to be a file too.")
471
+ parser.add_argument("--o",
472
+ default=cfg.OUTPUT_PATH_SAMPLES,
473
+ help="Path to output file or folder. If this is a file, --i needs to be a file too.")
474
+
475
+ parser.add_argument("--classifier",
476
+ default=None,
477
+ help="Path to custom trained classifier. Defaults to None. "
478
+ "If set, --lat, --lon and --locale are ignored."
479
+ )
480
+ parser.add_argument("--slist",
481
+ default="",
482
+ help='Path to species list file or folder. If folder is provided, species list needs to be '
483
+ 'named "species_list.txt". If lat and lon are provided, this list will be ignored.'
484
+ )
485
+ parser.add_argument("--lat",
486
+ type=float,
487
+ default=-1,
488
+ help="DISABLED. Set -1 to ignore.")
489
+ parser.add_argument("--lon",
490
+ type=float,
491
+ default=-1,
492
+ help="DISABLED. Set -1 to ignore.")
493
+ parser.add_argument("--week",
494
+ type=int,
495
+ default=-1,
496
+ help="DISABLED. Set -1 for year-round species list."
497
+ )
498
+ parser.add_argument("--locale",
499
+ default="en",
500
+ help="DISABLED. Defaults to 'en'."
501
+ )
502
+
503
+ def load_ebird_codes():
504
+ cfg.CODES = load_codes()
505
+ cfg.LABELS = utils.readLines(cfg.LABELS_FILE)
506
+
507
+ def load_species_list():
508
+ cfg.LATITUDE, cfg.LONGITUDE, cfg.WEEK = args.lat, args.lon, args.week
509
+ cfg.LOCATION_FILTER_THRESHOLD = max(0.01, min(0.99, float(args.sf_thresh)))
510
+ script_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
511
+
512
+ if cfg.LATITUDE == -1 and cfg.LONGITUDE == -1:
513
+ if not args.slist:
514
+ cfg.SPECIES_LIST_FILE = None
515
+ else:
516
+ cfg.SPECIES_LIST_FILE = os.path.join(script_dir, args.slist)
517
+
518
+ if os.path.isdir(cfg.SPECIES_LIST_FILE):
519
+ cfg.SPECIES_LIST_FILE = os.path.join(cfg.SPECIES_LIST_FILE, "species_list.txt")
520
+
521
+ cfg.SPECIES_LIST = utils.readLines(cfg.SPECIES_LIST_FILE)
522
+ else:
523
+ cfg.SPECIES_LIST_FILE = None
524
+ cfg.SPECIES_LIST = species.getSpeciesList(cfg.LATITUDE, cfg.LONGITUDE, cfg.WEEK, cfg.LOCATION_FILTER_THRESHOLD)
525
+ if not cfg.SPECIES_LIST:
526
+ print(f"Species list contains {len(cfg.LABELS)} species")
527
+ else:
528
+ print(f"Species list contains {len(cfg.SPECIES_LIST)} species")
529
+
530
+ def parse_input_files():
531
+ if os.path.isdir(cfg.INPUT_PATH):
532
+ cfg.FILE_LIST = utils.collect_audio_files(cfg.INPUT_PATH)
533
+ print(f"Found {len(cfg.FILE_LIST)} files to analyze")
534
+ else:
535
+ cfg.FILE_LIST = [cfg.INPUT_PATH]
536
+
537
+ def set_analysis_parameters():
538
+ cfg.MIN_CONFIDENCE = max(0.01, min(0.99, float(args.min_conf)))
539
+ cfg.SIGMOID_SENSITIVITY = max(0.5, min(1.0 - (float(args.sensitivity) - 1.0), 1.5))
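+ # Note: the user-facing value is inverted internally, e.g. --sensitivity 1.5
+ # maps to SIGMOID_SENSITIVITY = 0.5 and --sensitivity 0.5 maps to 1.5.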
540
+ cfg.SIG_OVERLAP = max(0.0, min(2.9, float(args.overlap)))
541
+ cfg.BATCH_SIZE = max(1, int(args.batchsize))
542
+
543
+ def set_hardware_parameters():
544
+ if os.path.isdir(cfg.INPUT_PATH):
545
+ cfg.CPU_THREADS = max(1, int(args.threads))
546
+ cfg.TFLITE_THREADS = 1
547
+ else:
548
+ cfg.CPU_THREADS = 1
549
+ cfg.TFLITE_THREADS = max(1, int(args.threads))
550
+
551
+ def load_translated_labels():
552
+ cfg.TRANSLATED_LABELS_PATH = cfg.TRANSLATED_BAT_LABELS_PATH
553
+ lfile = os.path.join(cfg.TRANSLATED_LABELS_PATH,
554
+ os.path.basename(cfg.LABELS_FILE).replace(".txt", "_{}.txt".format(args.locale))
555
+ )
556
+ if args.locale != "en" and os.path.isfile(lfile):
557
+ cfg.TRANSLATED_LABELS = utils.readLines(lfile)
558
+ else:
559
+ cfg.TRANSLATED_LABELS = cfg.LABELS
560
+
561
+ def check_result_type():
562
+ cfg.RESULT_TYPE = args.rtype.lower()
563
+ if cfg.RESULT_TYPE not in ["table", "audacity", "r", "kaleidoscope", "csv"]:
564
+ cfg.RESULT_TYPE = "csv"
565
+ print("Unknown output option. Using csv output.")
566
+
567
+ if __name__ == "__main__":
568
+ freeze_support() # Freeze support for executable
569
+ parser = argparse.ArgumentParser(description="Analyze audio files with BattyBirdNET")
570
+ add_parser_arguments()
571
+ args = parser.parse_args()
572
+ set_paths()
573
+ load_ebird_codes()
574
+ set_custom_classifier()
575
+ check_result_type()
576
+ set_analysis_location()
577
+ load_translated_labels()
578
+ load_species_list()
579
+ parse_input_files()
580
+ set_analysis_parameters()
581
+ set_hardware_parameters()
582
+ # Add config items to each file list entry.
583
+ # We have to do this for Windows which does not
584
+ # support fork() and thus each process has to
585
+ # have its own config. USE LINUX!
586
+ flist = [(f, cfg.get_config()) for f in cfg.FILE_LIST]
587
+
588
+ # Analyze files
589
+ if cfg.CPU_THREADS < 2:
590
+ for entry in flist:
591
+ analyze_file(entry)
592
+ else:
593
+ with Pool(cfg.CPU_THREADS) as p:
594
+ p.map(analyze_file, flist)
595
+
596
+ if args.segment == "on" or args.spectrum == "on":
597
+ subprocess.run(["python3", "segments.py"])
598
+
599
+ if args.spectrum == "on":
600
+ # Iterate through the subfolders of the segments folder and call the plotter
601
+ print("Spectrums in progress ...")
602
+ script_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
603
+ root_dir = pathlib.Path(os.path.join(script_dir, args.i, "segments"))
604
+ for dir_name in os.listdir(root_dir):
605
+ f = os.path.join(root_dir, dir_name)
606
+ if not os.path.isfile(f):
607
+ print("Spectrum in progres for: " + f)
608
+ cmd = ['python3', "batchspec.py", f, f]
609
+ subprocess.run(cmd)
610
+ # A few examples to test
611
+ # python3 bat_ident.py --i example/ --o example/ --slist example/ --min_conf 0.5 --threads 4
612
+ # python3 bat_ident.py --i example/soundscape.wav --o example/soundscape.BirdNET.selection.table.txt --slist example/species_list.txt --threads 8
613
+ # python3 bat_ident.py --i example/ --o example/ --area Bavaria --sensitivity 1.0 --rtype table --locale de
614
+
615
+
616
+
config.py ADDED
@@ -0,0 +1,257 @@
1
+ #################
2
+ # Misc settings #
3
+ #################
4
+
5
+ # Random seed for Gaussian noise
6
+ RANDOM_SEED = 42
7
+
8
+ ##########################
9
+ # Model paths and config #
10
+ ##########################
11
+ # These BirdNET models are also required for bat detection, since we extract their
12
+ # embeddings and classify those to identify the bats.
13
+ # MODEL_PATH = 'checkpoints/V2.4/BirdNET_GLOBAL_6K_V2.4_Model' # This will load the protobuf model
14
+ MODEL_PATH = 'checkpoints/V2.4/BirdNET_GLOBAL_6K_V2.4_Model_FP32.tflite'
15
+ MDATA_MODEL_PATH = 'checkpoints/V2.4/BirdNET_GLOBAL_6K_V2.4_MData_Model_FP16.tflite'
16
+ LABELS_FILE = 'checkpoints/V2.4/BirdNET_GLOBAL_6K_V2.4_Labels.txt'
17
+ TRANSLATED_LABELS_PATH = 'labels/V2.4'
18
+ TRANSLATED_BAT_LABELS_PATH = 'labels/bats/'
19
+
20
+ # Path to custom trained classifier
21
+ # If None, no custom classifier will be used
22
+ # Make sure to set the LABELS_FILE above accordingly
23
+ CUSTOM_CLASSIFIER = None
24
+
25
+ ##################
26
+ # Audio settings #
27
+ ##################
28
+
29
+ # BirdNET uses a sample rate of 48 kHz, so the model input size is
30
+ # (batch size, 48000 Hz * 3 seconds) = (1, 144000)
31
+ # Recordings will be resampled automatically.
32
+ # For bats we use a sample rate of 144 kHz and chunks of 1 second.
33
+ # Note that only combinations with SIG_LENGTH * SAMPLE_RATE = 144000 will work;
34
+ # possible values are e.g. 144000, 240000, or 360000 - check your classifier's frequency!
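+ # Worked example: a 240 kHz classifier gives SIG_LENGTH = 144000 / 240000 = 0.6 s,
+ # so each chunk still yields exactly the 144000 samples the model expects.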
35
+ SAMPLE_RATE: int = 144000
36
+
37
+ # We're using 1-second chunks
38
+ SIG_LENGTH: float = 144000 / SAMPLE_RATE
39
+
40
+ # Overlap between consecutive chunks in seconds; must be < SIG_LENGTH; 0 = no overlap
41
+ SIG_OVERLAP: float = SIG_LENGTH / 4.0
42
+
43
+ # Define minimum length of audio chunk for prediction,
44
+ # chunks shorter than SIG_LENGTH seconds will be padded with zeros
45
+ SIG_MINLEN: float = SIG_LENGTH / 3.0
46
+
47
+ #####################
48
+ # Metadata settings #
49
+ #####################
50
+ # These settings are currently not in use for bat detection
51
+ LATITUDE = -1
52
+ LONGITUDE = -1
53
+ WEEK = -1
54
+ LOCATION_FILTER_THRESHOLD = 0.03
55
+
56
+ ######################
57
+ # Inference settings #
58
+ ######################
59
+
60
+ # If None or empty file, no custom species list will be used
61
+ # Note: Entries in this list have to match entries from the LABELS_FILE
62
+ # We use the 2021 eBird taxonomy for species names (Clements list)
63
+ CODES_FILE = 'eBird_taxonomy_codes_2021E.json'
64
+ SPECIES_LIST_FILE = 'example/species_list.txt'
65
+
66
+ # File input path and output path for selection tables
67
+ INPUT_PATH: str = 'example/'
68
+ OUTPUT_PATH: str = 'example/'
69
+
70
+ # Used for bats - the files here are supposed to be analyzed by default setting
71
+ INPUT_PATH_SAMPLES: str = 'put-your-files-here/'
72
+ OUTPUT_PATH_SAMPLES: str = 'put-your-files-here/results/'
73
+ BAT_CLASSIFIER_LOCATION: str = 'checkpoints/bats/v1.0'
74
+
75
+ ALLOWED_FILETYPES = ['wav', 'flac', 'mp3', 'ogg', 'm4a']
76
+
77
+ # Number of threads to use for inference.
78
+ # Can be as high as number of CPUs in your system
79
+ CPU_THREADS: int = 8
80
+ TFLITE_THREADS: int = 6
81
+
82
+ # False will output logits, True will convert to sigmoid activations
83
+ APPLY_SIGMOID: bool = True
84
+ SIGMOID_SENSITIVITY: float = 1.0
85
+
86
+ # Minimum confidence score to include in selection table
87
+ # (be aware: if APPLY_SIGMOID = False, this no longer represents
88
+ # probabilities and needs to be adjusted)
89
+ MIN_CONFIDENCE: float = 0.6
90
+
91
+ # Number of samples to process at the same time. Higher values can increase
92
+ # processing speed, but will also increase memory usage.
93
+ # Might only be useful for GPU inference.
94
+ BATCH_SIZE: int = 1
95
+
96
+ # Specifies the output format. 'table' denotes a Raven selection table,
97
+ # 'audacity' denotes a TXT file with the same format as Audacity timeline labels
98
+ # 'csv' denotes a CSV file with start, end, species and confidence. 'r' and 'kaleidoscope' are also supported.
99
+ RESULT_TYPE = 'csv'
100
+
101
+ #####################
102
+ # Training settings #
103
+ #####################
104
+
105
+ # Training data path
106
+ TRAIN_DATA_PATH = 'train_data/'
107
+
108
+ # Number of epochs to train for
109
+ TRAIN_EPOCHS: int = 100
110
+
111
+ # Batch size for training
112
+ TRAIN_BATCH_SIZE: int = 32
113
+
114
+ # Learning rate for training
115
+ TRAIN_LEARNING_RATE: float = 0.01
116
+
117
+ # Number of hidden units in custom classifier
118
+ # If >0, a two-layer classifier will be trained
119
+ TRAIN_HIDDEN_UNITS: int = 0
120
+
121
+ #####################
122
+ # Misc runtime vars #
123
+ #####################
124
+ CODES = {}
125
+ LABELS: list[str] = []
126
+ TRANSLATED_LABELS: list[str] = []
127
+ SPECIES_LIST: list[str] = []
128
+ ERROR_LOG_FILE: str = 'error_log.txt'
129
+ FILE_LIST = []
130
+ FILE_STORAGE_PATH = ''
131
+
132
+
133
+ ######################
134
+ # Get and set config #
135
+ ######################
136
+
137
+ def get_config():
138
+ return {
139
+ 'RANDOM_SEED': RANDOM_SEED,
140
+ 'MODEL_PATH': MODEL_PATH,
141
+ 'MDATA_MODEL_PATH': MDATA_MODEL_PATH,
142
+ 'LABELS_FILE': LABELS_FILE,
143
+ 'CUSTOM_CLASSIFIER': CUSTOM_CLASSIFIER,
144
+ 'SAMPLE_RATE': SAMPLE_RATE,
145
+ 'SIG_LENGTH': SIG_LENGTH,
146
+ 'SIG_OVERLAP': SIG_OVERLAP,
147
+ 'SIG_MINLEN': SIG_MINLEN,
148
+ 'LATITUDE': LATITUDE,
149
+ 'LONGITUDE': LONGITUDE,
150
+ 'WEEK': WEEK,
151
+ 'LOCATION_FILTER_THRESHOLD': LOCATION_FILTER_THRESHOLD,
152
+ 'CODES_FILE': CODES_FILE,
153
+ 'SPECIES_LIST_FILE': SPECIES_LIST_FILE,
154
+ 'INPUT_PATH': INPUT_PATH,
155
+ 'OUTPUT_PATH': OUTPUT_PATH,
156
+ 'CPU_THREADS': CPU_THREADS,
157
+ 'TFLITE_THREADS': TFLITE_THREADS,
158
+ 'APPLY_SIGMOID': APPLY_SIGMOID,
159
+ 'SIGMOID_SENSITIVITY': SIGMOID_SENSITIVITY,
160
+ 'MIN_CONFIDENCE': MIN_CONFIDENCE,
161
+ 'BATCH_SIZE': BATCH_SIZE,
162
+ 'RESULT_TYPE': RESULT_TYPE,
163
+ 'TRAIN_DATA_PATH': TRAIN_DATA_PATH,
164
+ 'TRAIN_EPOCHS': TRAIN_EPOCHS,
165
+ 'TRAIN_BATCH_SIZE': TRAIN_BATCH_SIZE,
166
+ 'TRAIN_LEARNING_RATE': TRAIN_LEARNING_RATE,
167
+ 'TRAIN_HIDDEN_UNITS': TRAIN_HIDDEN_UNITS,
168
+ 'CODES': CODES,
169
+ 'LABELS': LABELS,
170
+ 'TRANSLATED_LABELS': TRANSLATED_LABELS,
171
+ 'SPECIES_LIST': SPECIES_LIST,
172
+ 'ERROR_LOG_FILE': ERROR_LOG_FILE,
173
+ 'INPUT_PATH_SAMPLES': INPUT_PATH_SAMPLES,
174
+ 'OUTPUT_PATH_SAMPLES': OUTPUT_PATH_SAMPLES,
175
+ 'BAT_CLASSIFIER_LOCATION': BAT_CLASSIFIER_LOCATION,
176
+ 'TRANSLATED_BAT_LABELS_PATH': TRANSLATED_BAT_LABELS_PATH
177
+ }
178
+
179
+
180
+ def set_config(c):
181
+ global RANDOM_SEED
182
+ global MODEL_PATH
183
+ global MDATA_MODEL_PATH
184
+ global LABELS_FILE
185
+ global CUSTOM_CLASSIFIER
186
+ global SAMPLE_RATE
187
+ global SIG_LENGTH
188
+ global SIG_OVERLAP
189
+ global SIG_MINLEN
190
+ global LATITUDE
191
+ global LONGITUDE
192
+ global WEEK
193
+ global LOCATION_FILTER_THRESHOLD
194
+ global CODES_FILE
195
+ global SPECIES_LIST_FILE
196
+ global INPUT_PATH
197
+ global OUTPUT_PATH
198
+ global CPU_THREADS
199
+ global TFLITE_THREADS
200
+ global APPLY_SIGMOID
201
+ global SIGMOID_SENSITIVITY
202
+ global MIN_CONFIDENCE
203
+ global BATCH_SIZE
204
+ global RESULT_TYPE
205
+ global TRAIN_DATA_PATH
206
+ global TRAIN_EPOCHS
207
+ global TRAIN_BATCH_SIZE
208
+ global TRAIN_LEARNING_RATE
209
+ global TRAIN_HIDDEN_UNITS
210
+ global CODES
211
+ global LABELS
212
+ global TRANSLATED_LABELS
213
+ global SPECIES_LIST
214
+ global ERROR_LOG_FILE
215
+ global INPUT_PATH_SAMPLES
216
+ global OUTPUT_PATH_SAMPLES
217
+ global BAT_CLASSIFIER_LOCATION
218
+ global TRANSLATED_BAT_LABELS_PATH
219
+
220
+ RANDOM_SEED = c['RANDOM_SEED']
221
+ MODEL_PATH = c['MODEL_PATH']
222
+ MDATA_MODEL_PATH = c['MDATA_MODEL_PATH']
223
+ LABELS_FILE = c['LABELS_FILE']
224
+ CUSTOM_CLASSIFIER = c['CUSTOM_CLASSIFIER']
225
+ SAMPLE_RATE = c['SAMPLE_RATE']
226
+ SIG_LENGTH = c['SIG_LENGTH']
227
+ SIG_OVERLAP = c['SIG_OVERLAP']
228
+ SIG_MINLEN = c['SIG_MINLEN']
229
+ LATITUDE = c['LATITUDE']
230
+ LONGITUDE = c['LONGITUDE']
231
+ WEEK = c['WEEK']
232
+ LOCATION_FILTER_THRESHOLD = c['LOCATION_FILTER_THRESHOLD']
233
+ CODES_FILE = c['CODES_FILE']
234
+ SPECIES_LIST_FILE = c['SPECIES_LIST_FILE']
235
+ INPUT_PATH = c['INPUT_PATH']
236
+ OUTPUT_PATH = c['OUTPUT_PATH']
237
+ CPU_THREADS = c['CPU_THREADS']
238
+ TFLITE_THREADS = c['TFLITE_THREADS']
239
+ APPLY_SIGMOID = c['APPLY_SIGMOID']
240
+ SIGMOID_SENSITIVITY = c['SIGMOID_SENSITIVITY']
241
+ MIN_CONFIDENCE = c['MIN_CONFIDENCE']
242
+ BATCH_SIZE = c['BATCH_SIZE']
243
+ RESULT_TYPE = c['RESULT_TYPE']
244
+ TRAIN_DATA_PATH = c['TRAIN_DATA_PATH']
245
+ TRAIN_EPOCHS = c['TRAIN_EPOCHS']
246
+ TRAIN_BATCH_SIZE = c['TRAIN_BATCH_SIZE']
247
+ TRAIN_LEARNING_RATE = c['TRAIN_LEARNING_RATE']
248
+ TRAIN_HIDDEN_UNITS = c['TRAIN_HIDDEN_UNITS']
249
+ CODES = c['CODES']
250
+ LABELS = c['LABELS']
251
+ TRANSLATED_LABELS = c['TRANSLATED_LABELS']
252
+ SPECIES_LIST = c['SPECIES_LIST']
253
+ ERROR_LOG_FILE = c['ERROR_LOG_FILE']
254
+ INPUT_PATH_SAMPLES = c['INPUT_PATH_SAMPLES']
255
+ OUTPUT_PATH_SAMPLES = c['OUTPUT_PATH_SAMPLES']
256
+ BAT_CLASSIFIER_LOCATION = c['BAT_CLASSIFIER_LOCATION']
257
+ TRANSLATED_BAT_LABELS_PATH = c['TRANSLATED_BAT_LABELS_PATH']
model.py ADDED
@@ -0,0 +1,389 @@
1
+ """Contains functions to use the BirdNET models.
2
+ """
3
+ import os
4
+ import warnings
5
+
6
+ import numpy as np
7
+
8
+ import config as cfg
9
+
10
+ os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
11
+ os.environ["CUDA_VISIBLE_DEVICES"] = ""
12
+
13
+ warnings.filterwarnings("ignore")
14
+
15
+ # Import TFLite from runtime or Tensorflow;
16
+ # import Keras if protobuf model;
17
+ # NOTE: we have to use TFLite if we want to use
18
+ # the metadata model or want to extract embeddings
19
+ try:
20
+ import tflite_runtime.interpreter as tflite
21
+ except ModuleNotFoundError:
22
+ from tensorflow import lite as tflite
23
+ if not cfg.MODEL_PATH.endswith(".tflite"):
24
+ from tensorflow import keras
25
+
26
+ INTERPRETER: tflite.Interpreter = None
27
+ C_INTERPRETER: tflite.Interpreter = None
28
+ M_INTERPRETER: tflite.Interpreter = None
29
+ PBMODEL = None
30
+
31
+
32
+ def loadModel(class_output=True):
33
+ """Initializes the BirdNET Model.
34
+
35
+ Args:
36
+ class_output: Omits the last layer when False.
37
+ """
38
+ global PBMODEL
39
+ global INTERPRETER
40
+ global INPUT_LAYER_INDEX
41
+ global OUTPUT_LAYER_INDEX
42
+
43
+ # Do we have to load the tflite or protobuf model?
44
+ if cfg.MODEL_PATH.endswith(".tflite"):
45
+ # Load TFLite model and allocate tensors.
46
+ INTERPRETER = tflite.Interpreter(model_path=cfg.MODEL_PATH, num_threads=cfg.TFLITE_THREADS)
47
+ INTERPRETER.allocate_tensors()
48
+
49
+ # Get input and output tensors.
50
+ input_details = INTERPRETER.get_input_details()
51
+ output_details = INTERPRETER.get_output_details()
52
+
53
+ # Get input tensor index
54
+ INPUT_LAYER_INDEX = input_details[0]["index"]
55
+
56
+ # Get classification output or feature embeddings
57
+ if class_output:
58
+ OUTPUT_LAYER_INDEX = output_details[0]["index"]
59
+ else:
60
+ OUTPUT_LAYER_INDEX = output_details[0]["index"] - 1
61
+
62
+ else:
63
+ # Load protobuf model
64
+ # Note: This will throw a bunch of warnings about custom gradients
65
+ # which we will ignore until TF lets us block them
66
+ PBMODEL = keras.models.load_model(cfg.MODEL_PATH, compile=False)
67
+
68
+
69
+ def loadCustomClassifier():
70
+ """Loads the custom classifier."""
71
+ global C_INTERPRETER
72
+ global C_INPUT_LAYER_INDEX
73
+ global C_OUTPUT_LAYER_INDEX
74
+
75
+ # Load TFLite model and allocate tensors.
76
+ C_INTERPRETER = tflite.Interpreter(model_path=cfg.CUSTOM_CLASSIFIER, num_threads=cfg.TFLITE_THREADS)
77
+ C_INTERPRETER.allocate_tensors()
78
+
79
+ # Get input and output tensors.
80
+ input_details = C_INTERPRETER.get_input_details()
81
+ output_details = C_INTERPRETER.get_output_details()
82
+
83
+ # Get input tensor index
84
+ C_INPUT_LAYER_INDEX = input_details[0]["index"]
85
+
86
+ # Get classification output
87
+ C_OUTPUT_LAYER_INDEX = output_details[0]["index"]
88
+
89
+
90
+ def loadMetaModel():
91
+ """Loads the model for species prediction.
92
+
93
+ Initializes the model used to predict species list, based on coordinates and week of year.
94
+ """
95
+ global M_INTERPRETER
96
+ global M_INPUT_LAYER_INDEX
97
+ global M_OUTPUT_LAYER_INDEX
98
+
99
+ # Load TFLite model and allocate tensors.
100
+ M_INTERPRETER = tflite.Interpreter(model_path=cfg.MDATA_MODEL_PATH, num_threads=cfg.TFLITE_THREADS)
101
+ M_INTERPRETER.allocate_tensors()
102
+
103
+ # Get input and output tensors.
104
+ input_details = M_INTERPRETER.get_input_details()
105
+ output_details = M_INTERPRETER.get_output_details()
106
+
107
+ # Get input tensor index
108
+ M_INPUT_LAYER_INDEX = input_details[0]["index"]
109
+ M_OUTPUT_LAYER_INDEX = output_details[0]["index"]
110
+
111
+
112
+ def buildLinearClassifier(num_labels, input_size, hidden_units=0):
113
+ """Builds a classifier.
114
+
115
+ Args:
116
+ num_labels: Output size.
117
+ input_size: Size of the input.
118
+ hidden_units: If > 0, creates another hidden layer with the given number of units.
119
+
120
+ Returns:
121
+ A new classifier.
122
+ """
123
+ # import keras
124
+ from tensorflow import keras
125
+
126
+ # Build a simple one- or two-layer linear classifier
127
+ model = keras.Sequential()
128
+
129
+ # Input layer
130
+ model.add(keras.layers.InputLayer(input_shape=(input_size,)))
131
+
132
+ # Hidden layer
133
+ if hidden_units > 0:
134
+ model.add(keras.layers.Dense(hidden_units, activation="relu"))
135
+
136
+ # Classification layer
137
+ model.add(keras.layers.Dense(num_labels))
138
+
139
+ # Activation layer
140
+ model.add(keras.layers.Activation("sigmoid"))
141
+
142
+ return model
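+ # Usage sketch (hypothetical sizes): buildLinearClassifier(num_labels=10, input_size=1024)
+ # yields InputLayer -> Dense(10) -> sigmoid; hidden_units=512 would add a ReLU layer in between.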
143
+
144
+
145
+ def trainLinearClassifier(classifier, x_train, y_train, epochs, batch_size, learning_rate, on_epoch_end=None):
146
+ """Trains a custom classifier.
147
+
148
+ Trains a new classifier for BirdNET based on the given data.
149
+
150
+ Args:
151
+ classifier: The classifier to be trained.
152
+ x_train: Samples.
153
+ y_train: Labels.
154
+ epochs: Number of epochs to train.
155
+ batch_size: Batch size.
156
+ learning_rate: The learning rate during training.
157
+ on_epoch_end: Optional callback `function(epoch, logs)`.
158
+
159
+ Returns:
160
+ (classifier, history)
161
+ """
162
+ # import keras
163
+ from tensorflow import keras
164
+
165
+ class FunctionCallback(keras.callbacks.Callback):
166
+ def __init__(self, on_epoch_end=None) -> None:
167
+ super().__init__()
168
+ self.on_epoch_end_fn = on_epoch_end
169
+
170
+ def on_epoch_end(self, epoch, logs=None):
171
+ if self.on_epoch_end_fn:
172
+ self.on_epoch_end_fn(epoch, logs)
173
+
174
+ # Set random seed
175
+ np.random.seed(cfg.RANDOM_SEED)
176
+
177
+ # Shuffle data
178
+ idx = np.arange(x_train.shape[0])
179
+ np.random.shuffle(idx)
180
+ x_train = x_train[idx]
181
+ y_train = y_train[idx]
182
+
183
+ # Random val split
184
+ x_val = x_train[int(0.8 * x_train.shape[0]) :]
185
+ y_val = y_train[int(0.8 * y_train.shape[0]) :]
+ # Hold the validation split out of the training data to avoid leakage
+ x_train = x_train[: int(0.8 * x_train.shape[0])]
+ y_train = y_train[: int(0.8 * y_train.shape[0])]
186
+
187
+ # Early stopping
188
+ callbacks = [
189
+ keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
190
+ FunctionCallback(on_epoch_end=on_epoch_end),
191
+ ]
192
+
193
+ # Cosine annealing lr schedule
194
+ lr_schedule = keras.optimizers.schedules.CosineDecay(learning_rate, epochs * x_train.shape[0] / batch_size)
195
+
196
+ # Compile model
197
+ classifier.compile(
198
+ optimizer=keras.optimizers.Adam(learning_rate=lr_schedule),
199
+ loss="binary_crossentropy",
200
+ metrics=[keras.metrics.Precision(top_k=1, name="prec")],
201
+ )
202
+
203
+ # Train model
204
+ history = classifier.fit(
205
+ x_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=(x_val, y_val), callbacks=callbacks
206
+ )
207
+
208
+ return classifier, history
209
+
210
+
211
+ def saveLinearClassifier(classifier, model_path, labels):
212
+ """Saves a custom classifier on the hard drive.
213
+
214
+ Saves the classifier as a tflite model, as well as the used labels in a .txt.
215
+
216
+ Args:
217
+ classifier: The custom classifier.
218
+ model_path: Path the model will be saved at.
219
+ labels: List of labels used for the classifier.
220
+ """
221
+ # Make folders
222
+ os.makedirs(os.path.dirname(model_path), exist_ok=True)
223
+
224
+ # Remove activation layer
225
+ classifier.pop()
226
+
227
+ # Save model as tflite
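+ # (conversion needs the full TensorFlow TFLite converter; the tflite_runtime
+ # Interpreter alone cannot convert models)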
228
+ converter = tflite.TFLiteConverter.from_keras_model(classifier)
229
+ tflite_model = converter.convert()
230
+ with open(model_path, "wb") as model_file:
+ model_file.write(tflite_model)
231
+
232
+ # Save labels
233
+ with open(model_path.replace(".tflite", "_Labels.txt"), "w") as f:
234
+ for label in labels:
235
+ f.write(label + "\n")
236
+
237
+
238
+ def predictFilter(lat, lon, week):
239
+ """Predicts the probability for each species.
240
+
241
+ Args:
242
+ lat: The latitude.
243
+ lon: The longitude.
244
+ week: The week of the year [1-48]. Use -1 for yearlong.
245
+
246
+ Returns:
247
+ A list of probabilities for all species.
248
+ """
249
+ global M_INTERPRETER
250
+
251
+ # Does interpreter exist?
252
+ if M_INTERPRETER is None:
253
+ loadMetaModel()
254
+
255
+ # Prepare mdata as sample
256
+ sample = np.expand_dims(np.array([lat, lon, week], dtype="float32"), 0)
257
+
258
+ # Run inference
259
+ M_INTERPRETER.set_tensor(M_INPUT_LAYER_INDEX, sample)
260
+ M_INTERPRETER.invoke()
261
+
262
+ return M_INTERPRETER.get_tensor(M_OUTPUT_LAYER_INDEX)[0]
263
+
264
+
265
+ def explore(lat: float, lon: float, week: int):
266
+ """Predicts the species list.
267
+
268
+ Predicts the species list based on the coordinates and week of year.
269
+
270
+ Args:
271
+ lat: The latitude.
272
+ lon: The longitude.
273
+ week: The week of the year [1-48]. Use -1 for yearlong.
274
+
275
+ Returns:
276
+ A sorted list of tuples with the score and the species.
277
+ """
278
+ # Make filter prediction
279
+ l_filter = predictFilter(lat, lon, week)
280
+
281
+ # Apply threshold
282
+ l_filter = np.where(l_filter >= cfg.LOCATION_FILTER_THRESHOLD, l_filter, 0)
283
+
284
+ # Zip with labels
285
+ l_filter = list(zip(l_filter, cfg.LABELS))
286
+
287
+ # Sort by filter value
288
+ l_filter = sorted(l_filter, key=lambda x: x[0], reverse=True)
289
+
290
+ return l_filter
291
+
292
+
293
+ def flat_sigmoid(x, sensitivity=-1):
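+ # Example: flat_sigmoid(0.0) == 0.5; with sensitivity=-1 this is the standard
+ # sigmoid 1 / (1 + exp(-x)). Clipping x to [-15, 15] keeps exp() from overflowing.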
294
+ return 1 / (1.0 + np.exp(sensitivity * np.clip(x, -15, 15)))
295
+
296
+
297
+ def predict(sample):
298
+ """Uses the main net to predict a sample.
299
+
300
+ Args:
301
+ sample: Audio sample.
302
+
303
+ Returns:
304
+ The prediction scores for the sample.
305
+ """
306
+ # Has custom classifier?
307
+ if cfg.CUSTOM_CLASSIFIER is not None:
308
+ return predictWithCustomClassifier(sample)
309
+
310
+ global INTERPRETER
311
+
312
+ # Does interpreter or keras model exist?
313
+ if INTERPRETER is None and PBMODEL is None:
314
+ loadModel()
315
+
316
+ if PBMODEL is None:
317
+ # Reshape input tensor
318
+ INTERPRETER.resize_tensor_input(INPUT_LAYER_INDEX, [len(sample), *sample[0].shape])
319
+ INTERPRETER.allocate_tensors()
320
+
321
+ # Make a prediction (Audio only for now)
322
+ INTERPRETER.set_tensor(INPUT_LAYER_INDEX, np.array(sample, dtype="float32"))
323
+ INTERPRETER.invoke()
324
+ prediction = INTERPRETER.get_tensor(OUTPUT_LAYER_INDEX)
325
+
326
+ return prediction
327
+
328
+ else:
329
+ # Make a prediction (Audio only for now)
330
+ prediction = PBMODEL.predict(sample)
331
+
332
+ return prediction
333
+
334
+
335
+ def predictWithCustomClassifier(sample):
336
+ """Uses the custom classifier to make a prediction.
337
+
338
+ Args:
339
+ sample: Audio sample.
340
+
341
+ Returns:
342
+ The prediction scores for the sample.
343
+ """
344
+ global C_INTERPRETER
345
+
346
+ # Does interpreter exist?
347
+ if C_INTERPRETER is None:
348
+ loadCustomClassifier()
349
+
350
+ # Get embeddings
351
+ feature_vector = embeddings(sample)
352
+
353
+ # Reshape input tensor
354
+ C_INTERPRETER.resize_tensor_input(C_INPUT_LAYER_INDEX, [len(feature_vector), *feature_vector[0].shape])
355
+ C_INTERPRETER.allocate_tensors()
356
+
357
+ # Make a prediction
358
+ C_INTERPRETER.set_tensor(C_INPUT_LAYER_INDEX, np.array(feature_vector, dtype="float32"))
359
+ C_INTERPRETER.invoke()
360
+ prediction = C_INTERPRETER.get_tensor(C_OUTPUT_LAYER_INDEX)
361
+
362
+ return prediction
363
+
364
+
365
+ def embeddings(sample):
366
+ """Extracts the embeddings for a sample.
367
+
368
+ Args:
369
+ sample: Audio samples.
370
+
371
+ Returns:
372
+ The embeddings.
373
+ """
374
+ global INTERPRETER
375
+
376
+ # Does interpreter exist?
377
+ if INTERPRETER is None:
378
+ loadModel(False)
379
+
380
+ # Reshape input tensor
381
+ INTERPRETER.resize_tensor_input(INPUT_LAYER_INDEX, [len(sample), *sample[0].shape])
382
+ INTERPRETER.allocate_tensors()
383
+
384
+ # Extract feature embeddings
385
+ INTERPRETER.set_tensor(INPUT_LAYER_INDEX, np.array(sample, dtype="float32"))
386
+ INTERPRETER.invoke()
387
+ features = INTERPRETER.get_tensor(OUTPUT_LAYER_INDEX)
388
+
389
+ return features
requirements.txt ADDED
@@ -0,0 +1,12 @@
1
+ bottle==0.12.25
2
+ gradio==3.41.0
3
+ librosa==0.10.1
4
+ matplotlib==3.5.3
5
+ numpy==1.24.3
7
+ pyinstaller==5.13.0
8
+ pywebview==4.2.2
9
+ Requests==2.31.0
10
+ soundfile==0.12.1
11
+ tensorflow_macos==2.13.0
12
+ tflite_runtime==2.13.0
segments.py ADDED
@@ -0,0 +1,305 @@
1
+ """Extract segments from audio files based on BirdNET detections.
2
+
3
+ Can be used to save the segments of the audio files for each detection.
4
+ """
5
+ import argparse
6
+ import os
7
+ from multiprocessing import Pool
8
+
9
+ import numpy as np
10
+
11
+ import audio
12
+ import config as cfg
13
+ import utils
14
+
15
+ # Set numpy random seed
16
+ np.random.seed(cfg.RANDOM_SEED)
17
+
18
+
19
+ def detectRType(line: str):
20
+ """Detects the type of result file.
21
+
22
+ Args:
23
+ line: First line of text.
24
+
25
+ Returns:
26
+ Either "table", "r", "kaleidoscope", "csv" or "audacity".
27
+ """
28
+ if line.lower().startswith("selection"):
29
+ return "table"
30
+ elif line.lower().startswith("filepath"):
31
+ return "r"
32
+ elif line.lower().startswith("indir"):
33
+ return "kaleidoscope"
34
+ elif line.lower().startswith("start (s)"):
35
+ return "csv"
36
+ else:
37
+ return "audacity"
38
+
39
+
40
+ def parseFolders(apath: str, rpath: str, allowed_result_filetypes: list[str] = ["txt", "csv"]) -> list[dict]:
41
+ """Read audio and result files.
42
+
43
+ Reads all audio files and BirdNET output inside directory recursively.
44
+
45
+ Args:
46
+ apath: Path to search for audio files.
47
+ rpath: Path to search for result files.
48
+ allowed_result_filetypes: List of extensions for the result files.
49
+
50
+ Returns:
51
+ A list of {"audio": path_to_audio, "result": path_to_result }.
52
+ """
53
+ data = {}
54
+ apath = apath.replace("/", os.sep).replace("\\", os.sep)
55
+ rpath = rpath.replace("/", os.sep).replace("\\", os.sep)
56
+
57
+ # Get all audio files
58
+ for root, _, files in os.walk(apath):
59
+ for f in files:
60
+ if f.rsplit(".", 1)[-1].lower() in cfg.ALLOWED_FILETYPES:
61
+ data[f.rsplit(".", 1)[0]] = {"audio": os.path.join(root, f), "result": ""}
62
+
63
+ # Get all result files
64
+ for root, _, files in os.walk(rpath):
65
+ for f in files:
66
+ if f.rsplit(".", 1)[-1] in allowed_result_filetypes and ".bat." in f:
67
+ data[f.split(".bat.", 1)[0]]["result"] = os.path.join(root, f)
68
+
69
+ # Convert to list
70
+ flist = [f for f in data.values() if f["result"]]
71
+
72
+ print(f"Found {len(flist)} audio files with valid result file.")
73
+
74
+ return flist
75
+
76
+
77
+ def parseFiles(flist: list[dict], max_segments=100):
78
+ """Extracts the segments for all files.
79
+
80
+ Args:
81
+ flist: List of dict with {"audio": path_to_audio, "result": path_to_result }.
82
+ max_segments: Number of segments per species.
83
+
84
+ Returns:
85
+ TODO @kahst
86
+ """
87
+ species_segments: dict[str, list] = {}
88
+
89
+ for f in flist:
90
+ # Paths
91
+ afile = f["audio"]
92
+ rfile = f["result"]
93
+
94
+ # Get all segments for result file
95
+ segments = findSegments(afile, rfile)
96
+
97
+ # Parse segments by species
98
+ for s in segments:
99
+ if s["species"] not in species_segments:
100
+ species_segments[s["species"]] = []
101
+
102
+ species_segments[s["species"]].append(s)
103
+
104
+ # Shuffle segments for each species and limit to max_segments
105
+ for s in species_segments:
106
+ np.random.shuffle(species_segments[s])
107
+ species_segments[s] = species_segments[s][:max_segments]
108
+
109
+ # Make dict of segments per audio file
110
+ segments: dict[str, list] = {}
111
+ seg_cnt = 0
112
+
113
+ for s in species_segments:
114
+ for seg in species_segments[s]:
115
+ if seg["audio"] not in segments:
116
+ segments[seg["audio"]] = []
117
+
118
+ segments[seg["audio"]].append(seg)
119
+ seg_cnt += 1
120
+
121
+ print(f"Found {seg_cnt} segments in {len(segments)} audio files.")
122
+
123
+ # Convert to list
124
+ flist = [tuple(e) for e in segments.items()]
125
+
126
+ return flist
127
+
128
+
129
+ def findSegments(afile: str, rfile: str):
130
+ """Extracts the segments for an audio file from the results file
131
+
132
+ Args:
133
+ afile: Path to the audio file.
134
+ rfile: Path to the result file.
135
+
136
+ Returns:
137
+ A list of dicts in the form of
138
+ {"audio": afile, "start": start, "end": end, "species": species, "confidence": confidence}
139
+ """
140
+ segments: list[dict] = []
141
+
142
+ # Open and parse result file
143
+ lines = utils.readLines(rfile)
144
+
145
+ # Auto-detect result type
146
+ rtype = detectRType(lines[0])
147
+
148
+ # Get start and end times based on rtype
149
+ confidence = 0
150
+ start = end = 0.0
151
+ species = ""
152
+
153
+ for i, line in enumerate(lines):
154
+ if rtype == "table" and i > 0:
155
+ d = line.split("\t")
156
+ start = float(d[3])
157
+ end = float(d[4])
158
+ species = d[-2]
159
+ confidence = float(d[-1])
160
+
161
+ elif rtype == "audacity":
162
+ d = line.split("\t")
163
+ start = float(d[0])
164
+ end = float(d[1])
165
+ species = d[2].split(", ")[1]
166
+ confidence = float(d[-1])
167
+
168
+ elif rtype == "r" and i > 0:
169
+ d = line.split(",")
170
+ start = float(d[1])
171
+ end = float(d[2])
172
+ species = d[4]
173
+ confidence = float(d[5])
174
+
175
+ elif rtype == "kaleidoscope" and i > 0:
176
+ d = line.split(",")
177
+ start = float(d[3])
178
+ end = float(d[4]) + start
179
+ species = d[5]
180
+ confidence = float(d[7])
181
+
182
+ elif rtype == "csv" and i > 0:
183
+ d = line.split(",")
184
+ start = float(d[0])
185
+ end = float(d[1])
186
+ species = d[3]
187
+ confidence = float(d[4])
188
+
189
+ # Check if confidence is high enough
190
+ if confidence >= cfg.MIN_CONFIDENCE:
191
+ segments.append({"audio": afile, "start": start, "end": end, "species": species, "confidence": confidence})
192
+
193
+ return segments
194
+
195
+
196
+ def extractSegments(item: tuple[tuple[str, list[dict]], float, dict[str]]):
197
+ """Saves each segment separately.
198
+
199
+ Creates an audio file for each species segment.
200
+
201
+ Args:
202
+ item: A tuple that contains ((audio file path, segments), segment length, config)
203
+ """
204
+ # Paths and config
205
+ afile = item[0][0]
206
+ segments = item[0][1]
207
+ seg_length = item[1]
208
+ cfg.set_config(item[2])
209
+
210
+ # Status
211
+ print(f"Extracting segments from {afile}")
212
+
213
+ try:
214
+ # Open audio file
215
+ sig, _ = audio.openAudioFile(afile, cfg.SAMPLE_RATE)
216
+ except Exception as ex:
217
+ print(f"Error: Cannot open audio file {afile}", flush=True)
218
+ utils.writeErrorLog(ex)
219
+
220
+ return
221
+
222
+ # Extract segments
223
+ for seg_cnt, seg in enumerate(segments, 1):
224
+ try:
225
+ # Get start and end times
226
+ start = int(seg["start"] * cfg.SAMPLE_RATE)
227
+ end = int(seg["end"] * cfg.SAMPLE_RATE)
228
+ offset = ((seg_length * cfg.SAMPLE_RATE) - (end - start)) // 2
229
+ start = max(0, start - offset)
230
+ end = min(len(sig), end + offset)
231
+
232
+ # Make sure segment is long enough
233
+ if end > start:
234
+ # Get segment raw audio from signal
235
+ seg_sig = sig[int(start) : int(end)]
236
+
237
+ # Make output path
238
+ outpath = os.path.join(cfg.OUTPUT_PATH, seg["species"])
239
+ os.makedirs(outpath, exist_ok=True)
240
+
241
+ # Save segment
242
+ seg_name = "{:.3f}_{}_{}.wav".format(
243
+ seg["confidence"], seg_cnt, seg["audio"].rsplit(os.sep, 1)[-1].rsplit(".", 1)[0]
244
+ )
245
+ seg_path = os.path.join(outpath, seg_name)
246
+ audio.saveSignal(seg_sig, seg_path)
247
+
248
+ except Exception as ex:
249
+ # Write error log
250
+ print(f"Error: Cannot extract segments from {afile}.", flush=True)
251
+ utils.writeErrorLog(ex)
252
+ return False
253
+
254
+ return True
255
+
256
+
257
+ if __name__ == "__main__":
258
+ # Parse arguments
259
+ parser = argparse.ArgumentParser(description="Extract segments from audio files based on BirdNET detections.")
260
+ parser.add_argument("--audio", default="put-your-files-here/", help="Path to folder containing audio files.")
261
+ parser.add_argument("--results", default="put-your-files-here/results", help="Path to folder containing result files.")
262
+ parser.add_argument("--o", default="put-your-files-here/segments/", help="Output folder path for extracted segments.")
263
+ parser.add_argument(
264
+ "--min_conf", type=float, default=0.1, help="Minimum confidence threshold. Values in [0.01, 0.99]. Defaults to 0.1."
265
+ )
266
+ parser.add_argument("--max_segments", type=int, default=100, help="Number of randomly extracted segments per species.")
267
+ parser.add_argument(
268
+ "--seg_length", type=float, default=3.0, help="Length of extracted segments in seconds. Defaults to 3.0."
269
+ )
270
+ parser.add_argument("--threads", type=int, default=4, help="Number of CPU threads.")
271
+
272
+ args = parser.parse_args()
273
+
274
+ # Parse audio and result folders
275
+ cfg.FILE_LIST = parseFolders(args.audio, args.results)
276
+
277
+ # Set output folder
278
+ cfg.OUTPUT_PATH = args.o
279
+
280
+ # Set number of threads
281
+ cfg.CPU_THREADS = int(args.threads)
282
+
283
+ # Set confidence threshold
284
+ cfg.MIN_CONFIDENCE = max(0.01, min(0.99, float(args.min_conf)))
285
+
286
+ # Parse file list and make list of segments
287
+ cfg.FILE_LIST = parseFiles(cfg.FILE_LIST, max(1, int(args.max_segments)))
288
+
289
+ # Add config items to each file list entry.
290
+ # We have to do this for Windows which does not
291
+ # support fork() and thus each process has to
292
+ # have its own config. USE LINUX!
293
+ flist = [(entry, max(cfg.SIG_LENGTH, float(args.seg_length)), cfg.get_config()) for entry in cfg.FILE_LIST]
294
+
295
+ # Extract segments
296
+ if cfg.CPU_THREADS < 2:
297
+ for entry in flist:
298
+ extractSegments(entry)
299
+ else:
300
+ with Pool(cfg.CPU_THREADS) as p:
301
+ p.map(extractSegments, flist)
302
+
303
+ # A few examples to test
304
+ # python3 segments.py --audio example/ --results example/ --o example/segments/
305
+ # python3 segments.py --audio example/ --results example/ --o example/segments/ --seg_length 5.0 --min_conf 0.1 --max_segments 100 --threads 4