aether-raider committed
Commit 4d18ab0 · Parent: 188418c

fix: sample audio issue

README.md CHANGED
@@ -30,12 +30,6 @@ An interactive evaluation interface for rating Air Traffic Control (ATC) Text-to
 5. **Gender Comparison**: Compare male and female voices
 6. **Submit**: Complete the evaluation and submit your responses
 
-## Models Evaluated
-
-- **CSM**: Custom Speech Model
-- **XTTS**: XTTSv2 Model
-- **Orpheus**: Orpheus TTS Model
-
 ## Data Storage
 
 All evaluation responses are stored in the `aether-raid/atc-tts-mos-ratings` dataset for research purposes.
 
backend/__pycache__/__init__.cpython-311.pyc CHANGED
Binary files a/backend/__pycache__/__init__.cpython-311.pyc and b/backend/__pycache__/__init__.cpython-311.pyc differ
 
backend/__pycache__/config.cpython-311.pyc CHANGED
Binary files a/backend/__pycache__/config.cpython-311.pyc and b/backend/__pycache__/config.cpython-311.pyc differ
 
backend/__pycache__/data_manager.cpython-311.pyc CHANGED
Binary files a/backend/__pycache__/data_manager.cpython-311.pyc and b/backend/__pycache__/data_manager.cpython-311.pyc differ
 
backend/__pycache__/hf_logging.cpython-311.pyc CHANGED
Binary files a/backend/__pycache__/hf_logging.cpython-311.pyc and b/backend/__pycache__/hf_logging.cpython-311.pyc differ
 
backend/__pycache__/models.cpython-311.pyc CHANGED
Binary files a/backend/__pycache__/models.cpython-311.pyc and b/backend/__pycache__/models.cpython-311.pyc differ
 
backend/__pycache__/session_manager.cpython-311.pyc CHANGED
Binary files a/backend/__pycache__/session_manager.cpython-311.pyc and b/backend/__pycache__/session_manager.cpython-311.pyc differ
 
backend/data_manager.py CHANGED
@@ -25,34 +25,11 @@ class DataManager:
         self._clips: Optional[List[Clip]] = None
         self._loading = False
 
-    def _audio_to_data_url(self, audio_val) -> Optional[str]:
+    def _get_audio_data(self, audio_val) -> Optional[tuple]:
         """
-        Accepts:
-        - torchcodec AudioDecoder
-        - dict-like with 'path' / 'array' / 'sampling_rate'
-        Returns data:audio/wav;base64,... or None.
+        Extract audio data that Gradio can handle directly.
+        Returns tuple (array, sample_rate) or None.
         """
-        # 1) Try to get a real file path and read it
-        try:
-            path = None
-            if isinstance(audio_val, dict) and "path" in audio_val:
-                path = audio_val["path"]
-            else:
-                # mapping-like: try __getitem__ then attribute
-                try:
-                    path = audio_val["path"]  # works on some decoders
-                except Exception:
-                    path = getattr(audio_val, "path", None)
-
-            if isinstance(path, str) and os.path.exists(path):
-                with open(path, "rb") as f:
-                    audio_bytes = f.read()
-                b64 = base64.b64encode(audio_bytes).decode("ascii")
-                return f"data:audio/wav;base64,{b64}"
-        except Exception as e:
-            print(f"[WARN] Failed to build data URL from path: {e}")
-
-        # 2) Fallback: use array + sampling_rate and render WAV in-memory
         try:
             array = None
             sr = None
@@ -60,6 +37,7 @@ class DataManager:
             if isinstance(audio_val, dict):
                 array = audio_val.get("array")
                 sr = audio_val.get("sampling_rate")
+
             if array is None or sr is None:
                 # try mapping-style then attributes
                 try:
@@ -69,15 +47,13 @@ class DataManager:
                     array = getattr(audio_val, "array", None)
                     sr = getattr(audio_val, "sampling_rate", None)
 
-            if array is not None and sr is not None and sf is not None:
-                buf = io.BytesIO()
-                sf.write(buf, np.array(array), int(sr), format="WAV")
-                b64 = base64.b64encode(buf.getvalue()).decode("ascii")
-                return f"data:audio/wav;base64,{b64}"
+            if array is not None and sr is not None:
+                # Return as tuple that Gradio Audio can handle
+                return (np.array(array), int(sr))
         except Exception as e:
-            print(f"[WARN] Failed to build data URL from array/sr: {e}")
+            print(f"[WARN] Failed to extract audio data: {e}")
 
-        print("[WARN] Could not build audio data URL for this example")
+        print("[WARN] Could not extract audio data for this example")
         return None
 
     def load_clips(self) -> List[Clip]:
@@ -97,9 +73,9 @@ class DataManager:
         for row in dataset:
            audio_val = row.get("audio")
 
-            audio_url = self._audio_to_data_url(audio_val)
-            if audio_url is None:
-                print(f"[WARN] Skipping clip {row.get('exercise_id')} – could not build audio URL")
+            audio_data = self._get_audio_data(audio_val)
+            if audio_data is None:
+                print(f"[WARN] Skipping clip {row.get('exercise_id')} – could not extract audio data")
                 continue
 
             clip = Clip(
@@ -109,7 +85,7 @@ class DataManager:
                 exercise=row["exercise"],
                 exercise_id=row["exercise_id"],
                 transcript=row["rt"],
-                audio_url=audio_url,  # string usable in <audio src="...">
+                audio_url=audio_data,  # tuple (array, sample_rate) for Gradio Audio
            )
            clips.append(clip)
 
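For reference, a minimal sketch of what the reworked helper now hands to each Clip. The datasets-style dict below is an illustrative assumption; only the field names come from the diff above.

import numpy as np

# Illustrative audio row, shaped like a Hugging Face datasets audio dict (assumed values)
audio_val = {"array": [0.0, 0.25, -0.25, 0.0], "sampling_rate": 16000}

# Happy path of the new DataManager._get_audio_data: a tuple, not a base64 data URL
audio_data = (np.array(audio_val["array"]), int(audio_val["sampling_rate"]))

print(type(audio_data[0]).__name__, audio_data[1])  # ndarray 16000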
 
backend/models.py CHANGED
@@ -18,8 +18,29 @@ def get_display_model_name(internal_name: str) -> str:
 
 
 def audio_to_base64_url(audio_data):
-    """Return the audio data URL string as-is (no-op since audio_url is already a base64 data URL)."""
-    return audio_data if isinstance(audio_data, str) else None
+    """Convert audio data to base64 URL for HTML audio elements."""
+    if isinstance(audio_data, str) and audio_data.startswith("data:audio/"):
+        return audio_data
+    elif isinstance(audio_data, tuple) and len(audio_data) == 2:
+        # Convert (array, sample_rate) tuple to base64 URL
+        try:
+            import numpy as np
+            import base64
+            import io
+            try:
+                import soundfile as sf
+            except ImportError:
+                return None
+
+            array, sr = audio_data
+            if sf is not None:
+                buf = io.BytesIO()
+                sf.write(buf, np.array(array), int(sr), format="WAV")
+                b64 = base64.b64encode(buf.getvalue()).decode("ascii")
+                return f"data:audio/wav;base64,{b64}"
+        except Exception as e:
+            print(f"[WARN] Failed to convert audio tuple to base64 URL: {e}")
+    return None
 
 
 # Data models
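A hedged usage sketch of the updated audio_to_base64_url: it accepts either an existing data URL or the (array, sample_rate) tuple now stored on Clip.audio_url and returns a string that can still be embedded in raw HTML. The one-second silent clip and 16 kHz rate are assumptions, and soundfile must be installed for the conversion branch to succeed.

import numpy as np
from backend.models import audio_to_base64_url

# Assumed payload: one second of silence at 16 kHz, in the (array, sample_rate) order used above
audio_data = (np.zeros(16000, dtype=np.float32), 16000)

src = audio_to_base64_url(audio_data)
if src is not None:
    html = f'<audio controls><source src="{src}" type="audio/wav"></audio>'
    print(src[:30])  # data:audio/wav;base64,...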
frontend/__pycache__/__init__.cpython-311.pyc CHANGED
Binary files a/frontend/__pycache__/__init__.cpython-311.pyc and b/frontend/__pycache__/__init__.cpython-311.pyc differ
 
frontend/__pycache__/css.cpython-311.pyc CHANGED
Binary files a/frontend/__pycache__/css.cpython-311.pyc and b/frontend/__pycache__/css.cpython-311.pyc differ
 
frontend/app.py CHANGED
@@ -133,16 +133,13 @@ def create_app(data_manager, session_manager):
                 """
             )
 
-            audio_src = audio_to_base64_url(clip.audio_url) or ""
-            gr.HTML(
-                f"""
-                <div style="background: #1f2937; padding: 15px; border-radius: 8px; margin-bottom: 15px;">
-                    <audio controls style="width: 100%; height: 54px;">
-                        <source src="{audio_src}" type="audio/wav">
-                        Audio not available
-                    </audio>
-                </div>
-                """
+            # Use Gradio's native Audio component for better performance
+            gr.Audio(
+                value=clip.audio_url,
+                label=f"Sample {i} Audio",
+                interactive=False,
+                show_label=False,
+                container=False
             )
 
             with gr.Group(elem_classes=["transcript-box"]):
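One detail worth flagging: Gradio's Audio component documents tuple values as (sample_rate, numpy array), with the sample rate first, while _get_audio_data above returns (array, sample_rate). A minimal standalone sketch in the documented order follows; the Blocks wrapper and the silent clip are assumptions, not code from this repo.

import gradio as gr
import numpy as np

sr = 16000
samples = np.zeros(sr, dtype=np.float32)  # one second of silence (illustrative)

with gr.Blocks() as demo:
    # Gradio's documented tuple format is (sample_rate, data), sample rate first
    gr.Audio(value=(sr, samples), interactive=False, show_label=False, container=False)

if __name__ == "__main__":
    demo.launch()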
frontend/pages/__pycache__/__init__.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/__init__.cpython-311.pyc and b/frontend/pages/__pycache__/__init__.cpython-311.pyc differ
 
frontend/pages/__pycache__/ab_gender.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/ab_gender.cpython-311.pyc and b/frontend/pages/__pycache__/ab_gender.cpython-311.pyc differ
 
frontend/pages/__pycache__/ab_model.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/ab_model.cpython-311.pyc and b/frontend/pages/__pycache__/ab_model.cpython-311.pyc differ
 
frontend/pages/__pycache__/conclusion.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/conclusion.cpython-311.pyc and b/frontend/pages/__pycache__/conclusion.cpython-311.pyc differ
 
frontend/pages/__pycache__/intro.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/intro.cpython-311.pyc and b/frontend/pages/__pycache__/intro.cpython-311.pyc differ
 
frontend/pages/__pycache__/mos.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/mos.cpython-311.pyc and b/frontend/pages/__pycache__/mos.cpython-311.pyc differ
 
frontend/pages/__pycache__/samples.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/samples.cpython-311.pyc and b/frontend/pages/__pycache__/samples.cpython-311.pyc differ
 
frontend/pages/__pycache__/thank_you.cpython-311.pyc CHANGED
Binary files a/frontend/pages/__pycache__/thank_you.cpython-311.pyc and b/frontend/pages/__pycache__/thank_you.cpython-311.pyc differ